AMERICAN 
STATISTICAL ASSOCIATION. 


NEW SERIES, No. 92. DECEMBER, 1910. 


THE CORRELATION OF ECONOMIC STATISTICS. 


By Warren M. Persons, Assistant Professor of Economics, Dartmouth 
College. 


Economics deals mainly with correlation rather than with simple causa- 
tion (p. 287); necessity of a method of measuring the correspondence between 
two series of statistics (p. 288); illustrations of the use of the graphic method 
(p. 290) ; the coefficient of correlation defined and illustrated (p. 298) ; the coeffi- 
cient of correlation as a measure of the grouping of points about the line of 
regression (p. 303) ; the equation of the line of regression (p. 303); the coeffi- 
cients of correlation computed for the illustrations cited above (p. 306); two 
influences affect the size of the coefficient of correlation, i.e., short-time fluctua- 
tions and secular tendency (p. 306); three methods of isolating the two influ- 
ences are described and illustrated (p. 310) ; the measurement of the correlation 
among three variables (p. 319); conclusion (p. 322). 


The cause and effect relation existing between economic 
events is especially difficult to ascertain because of the pres- 
ence of innumerable variable elements. In solving his prob-
lems the economist cannot, like the physicist or chemist,
eliminate all causes except one and then by experiment 
determine the effect of that one. Causes must be dealt with 
en masse. Since any effect is the result of many combined 
causes the economist is never sure that a given effect will 
follow a given cause. In stating an economic law he always 
has to postulate “other things remaining the same,” with, 
perhaps, little appreciation of what the other things may be. 
It is rarely, if ever, possible for the economist to state more 
than “such and such a cause tends to produce such and such 
an effect.” Events can only be stated to be more or less
probable. He is dealing mainly, therefore, with correlation 
and not with simple causation. 




The problems of economics are similar to certain problems 
of biology, such as the effect of environment and heredity 
upon the individual. In dealing with the question of heredity 
Karl Pearson says:* “Taking our stand then on the observed 
fact that a knowledge neither of parents nor of the whole 
ancestry will enable us to predict with certainty in a variety of 
important cases the character of the individual offspring we 
ask: What is the correct method of dealing with the problem 
of heredity in such cases? The causes A, B, C, D, E
which we have as yet succeeded in isolating and defining
are not always followed by the effect X, but by any one of 
the effects U, V, W, X, Y, Z. We are therefore not dealing 
with causation but correlation, and there is therefore only one 
method of procedure possible; we must collect statistics of the 
frequency with which U, V, W, X, Y, Z, respectively, follow 
on A, B, C, D, E . . . From these statistics we know the
most probable result of the causes A, B, C, D, E and the 
frequency of each deviation from this most probable result. 
The recognition that in the existing state of our knowledge the 
true method of approaching the problem of heredity is from 
the statistical side, and that the most that we can hope at 
present to do is to give the probable character of the offspring 
of a given ancestry, is one of the great services of Francis 
Galton to biometry.” 

Just as the biologists cannot predict a man’s height or color 
of eyes or temper or combativeness by knowing those quali- 
ties in his ancestors, so economists cannot predict that a defin- 
ite call rate in Wall Street will go with a given percentage of 
reserves to deposits in New York banks or that a given supply 
of wheat will result in a definite price per bushel. But, on the 
other hand, just as it has been observed that there is a rela-
tion existing between a man’s stature and the stature of his
ancestors, so it has been observed that a relation does exist 
between bank reserves and call rates and between supply of 
wheat and its price per bushel. 

In order to deal in a satisfactory way with such questions 
as those given above it is necessary to accumulate statistics of 
the supposedly related phenomena. In order to have those 


*The Law of Ancestral Heredity, Biometrika, Vol. II, p. 215. 




statistics indicate anything it is necessary to obtain a method 
of measuring the extent of correlation between the phenomena. 
The commonly used method of measuring the amount of 
correlation between any two series of economic statistics is to 
represent the two series graphically upon the same sheet of 
cross-section paper and then compare the fluctuations of one 
series with those of the other. The quantity theory of prices 
has been tested in this way by Dr. E. W. Kemmerer.* Dr. 
Kemmerer builds up the following price equation: 
$$P = \frac{MR + CR_c}{NE + N_cE_c}$$ †

in which:

$P$ = the average price (weighted by the total flows) of all commodities sold for money and deposit currency during a unit of time.
$M$ = the total currency in circulation during the unit of time.
$R$ = the average number of times each unit of currency changes hands during the unit of time.
($MR$ = the flow of currency.)
$NE$ = the flow of goods exchanged for currency.
$C$ = the volume of deposit currency exchanged for goods.
$R_c$ = the average rate of turnover of such deposit currency.
($CR_c$ = the flow of deposit currency.)
$N_cE_c$ = the flow of goods exchanged for deposit currency.


Dr. Kemmerer then attempts to find the answer that facts 
give to the following questions: 


1. Do the bank reserves vary directly with the money 
supply? 

2. Does the proportion of bank reserves to check circula- 
tion vary directly with the degree of business distrust existing 
in the country? 


*Money and Credit Instruments in their Relation to General Prices, Henry Holt and Co.,

†See Quarterly Journal of Economics, Feb., 1908, p. 274, for derivation of equation. In the
current article, I quote from the review of Doctor Kemmerer’s book. 




3. Is “a relative increase in the circulating media accom-
panied by a corresponding and proportionate increase in gen-
eral prices and a relative decrease in the circulating media, by
a corresponding and proportionate decrease in general prices,”
or, in the language of the formula, is

$$P = \frac{MR + CR_c}{NE + N_cE_c}$$ *

borne out by the facts?


All of the questions to be tested by the statistics collected 
are questions of correlation. Dr. Kemmerer makes the tests 
graphically, as has been stated, by comparing the fluctuations 
of the two curves based upon the pair of series of statistics 
being considered. The charts presented by Dr. Kemmerer 
from which his conclusions are drawn are given below. 

In the case of the correlation of bank reserves and money
in circulation, inclusive of bank reserves, Dr. Kemmerer con- 
cludes, ‘There can be no question but that when due allow- 
ance is made for fluctuations in business confidence, the evi- 
dence of Chart I strongly supports the contention that there 
exists a close relationship between the amount of money in 
circulation and the amount of the country’s bank reserves.”†
In the case of the correlation of business distrust and the ratio 
of bank reserves to check circulation the conclusion is, “the 
chart substantiates the contention . . . that the ratio 
of check circulation to bank reserves is a function of business 


*Money and Prices, p. 139. 
Dr. Kemmerer uses statistics of the United States for the period 1879-1904 to make 
his inductive tests. The statistics of total bank reserves he obtains from the Report of 
the Comptroller of the Currency. For the amount of money in each year (M) he takes 
the average of the total money in circulation at the beginning and end of that fiscal year 
as given in the Statistical Abstracts. The check circulation for each year he obtains by 
multiplying the total bank clearings in each year by 4°. This ratio is the ratio between 
the estimated total check circulation for 1896 and the bank clearings for that year. The 
rapidity of circulation of 47 per year he derives by dividing the estimated total money
transactions in 1896 by the money circulation of that year. The figures for the growth 
of business he finds by taking the simple average of index numbers of fifteen different 
series of statistics taken as representing the industrial activity of the year considered. 
The index numbers of business distrust are the simple averages of the corresponding 
indices for the proportion of concerns failing, and the average liabilities of concerns fail-
ing. “The general index figures of prices and wages were computed by combining in a
weighted average the index figures for the prices of railroad securities (Commons), the 
index figures for the prices of wholesale commodities (Commons), and the index figures 
for wages (Department of Labor tables for twenty-five occupations).” 


†Ibid., p. 143.


CHART I. [Chart: money in circulation, inclusive of bank reserves, and bank reserves.]

CHART II. [Chart: business distrust and the ratio of bank reserves to check circulation.]

[Chart: relative circulation and prices.]


confidence . . .”* The final test of the quantity theory
is the amount of correlation between the figures for the right
and left-hand sides of the equation

$$P = \frac{MR + CR_c}{NE + N_cE_c}.$$

Upon examination of the curves plotted from the two series of sta-
tistics representing general prices and relative circulation (the
left and right-hand sides, respectively, of the price equation)
Dr. Kemmerer concludes, “The general movement of the two
curves taken as a whole is the same, while the individual
variations from year to year exhibit a striking similarity.”†

The graphic method of comparing fluctuations is well 
enough as a preliminary, but does it enable anyone to tell any- 
thing of the extent of the correlation between the series of figures be- 
ing considered? Is Dr. Kemmerer warranted in deducing his 
conclusions from observation of the charts? It seems to the 
writer that one opposing the quantity theory might draw 
opposite conclusions with as much (or as little) reason. The 
charts do not answer the questions proposed. The painstaking 
collection of statistics to test correlation is useless if there be 
no more reliable method to measure correlation. A numerical 
measure of the correlation must be found if we wish to de-
termine the extent to which the fluctuations of one series
synchronize with the fluctuations of another series. 

A second illustration of a conclusion based upon graphic 
representation is that of Ira Cross in his study of strike sta-
tistics.‡ He says, upon consideration of data taken from
the Twenty-first Annual Report of the United States Bureau
of Labor, “the percentage of successful strikes decreases dur-
ing periods of business prosperity and increases during ‘hard
times.’” In the accompanying charts the per cent. of estab-
lishments in which strikes were successful is plotted, first,
with the per capita exports and imports and second, with index 
numbers of wholesale prices. The foreign trade and the 
price statistics are taken as indicative of the activity of busi- 
ness, as indices of prosperity. 

*Ibid., p. 146. 


†Ibid., p. 147.
‡Quarterly Publications of the American Statistical Association, June, 1908, p. 168.


[Chart: per cent. of establishments in which strikes were successful, and per capita imports plus exports.]

[Chart: per cent. of establishments in which strikes were successful, and index numbers of prices.]


A third illustration of a conclusion relating to correlation 
is taken from the London Statist of April 4, 1908, where the 
proposition is made that, “When commodities advance prices
of Stock Exchange securities recede; when commodities recede 
Stock Exchange securities advance.” The proposition is sup- 
ported by reference to the following chart showing the yearly 
average price of consols and Sauerbeck’s index numbers of 
prices. 


[Chart: yearly average price of consols and Sauerbeck's index numbers, 1851-1908.]


The foregoing illustrations show the need by economists of 
a quantitative measure of correlation. Such a measure has 
been widely used in biological statistics and used to a limited 
extent in economic statistics.* G. U. Yule has used the
measure in his study of “Pauperism;”† R. H. Hooker has used
it in his “Correlation of the Weather and Crops;”‡ J. P.
Norton applied it in his study of the “New York Money
Market.” This measure, the coefficient of correlation, will be
computed for the data upon which the conclusions quoted
above are based. The formula for the coefficient of correlation is

$$r = \frac{\sum xy}{n\,\sigma_1\sigma_2}$$

where:

$x$ = deviation from arithmetic mean = $X - M_1$,
$y$ = deviation from arithmetic mean = $Y - M_2$,
$\sigma_1$ = standard deviation of X series,
$\sigma_2$ = standard deviation of Y series,
$n$ = number of items.

The coefficient of correlation “serves as a measure of any
statement involving two qualifying adjectives, which can be
measured numerically, such as ‘tall men have tall sons,’
‘wet springs bring dry summers,’ ‘short hours go with high
wages.’”§ It is not the purpose in what follows to go through
the mathematical derivation of the coefficient of correlation,
but to test the formula empirically in order to ascertain how it 
actually varies for given series of statistics and to point out 
some of its features. 

However, it should be noted at this point that the coeffi-
cient of correlation is not empirical but was derived by a
priori reasoning. It was found by assuming that a large
number of independent causes operate upon each of the two
*See the very interesting paper by G. U. Yule on the Applications of the Method of
Correlation to Social and Economic Statistics. (Journal of the Royal Statistical Society,
December, 1909.) This paper gives a history of the application of the method of correlation
with a bibliography of works in which it has been applied to economic and social statistics.


†Journal of the Royal Statistical Society, 1899, Vol. 62, pp. 249-286.
‡Journal of the Royal Statistical Society, March, 1907, p. 1.
§Bowley, A. L., Elements of Statistics, p. 320. 




series X and Y, producing normal distributions in both cases. 
Upon the assumption that the set of causes operating upon the 
series X is not independent of the set of causes operating upon 
the series Y the value $r = \frac{\sum xy}{n\sigma_1\sigma_2}$ is obtained. This value
becomes zero when the operating causes are absolutely inde- 
pendent. Hence the value of r was taken as a measure of 
correlation.* In what follows no assumption concerning 
the type of distribution of the X and Y series will be made. 

Some appreciation of the meaning of the coefficient of 
correlation can be obtained by the consideration of a few 
simple applications. Suppose that we consider the two series 
of measurements: 


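X = 1, 2, 3, 4, 5        M_1 = 3
Y = 6, 8, 10, 12, 14     M_2 = 10

  x     y    x²    y²    xy
 −2    −4     4    16     8
 −1    −2     1     4     2
  0     0     0     0     0
 +1    +2     1     4     2
 +2    +4     4    16     8
             10    40    20

σ_1 = √2 = 1.41, σ_2 = √8 = 2.83, r = 20/(5 × √2 × √8) = +1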

In the above illustration the numbers were chosen so that 
for an increase of 1 unit in the X series there is an increase of 
2 units in the Y series. Thus the correlation is perfect and 
r equals +1. If the Y series had been 14, 12, 10, 8, 6 (the
X series remaining the same) the value of r would have been
−1. Thus −1 stands for perfect negative correlation, an
increase in one series corresponding to a decrease in the
other. It should also be noted in this connection “that the
coefficient of correlation (r) cannot be less than −1 nor more
than +1.”†


*See Yule, Journal of the Royal Statistical Society, Vol. 60, p. 839; Bowley, Elements 
of Statistics, p. 316; Elderton, Frequency Curves and Correlation, p. 106. 

†Proof of this statement may be conveniently found in Bowley, Elements of Statistics,
p. 319. 
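
The illustration with the series 1, 2, 3, 4, 5 and 6, 8, 10, 12, 14 can be checked mechanically. The following is a minimal sketch in Python, assuming only the formula r = Σxy/(nσ_1σ_2) as defined in the text; the function name `correlation_coefficient` is a label chosen here for illustration and does not come from the article.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """r = sum(x*y) / (n * sigma_1 * sigma_2), x and y being deviations from the means."""
    n = len(xs)
    m1 = sum(xs) / n                      # arithmetic mean of the X series
    m2 = sum(ys) / n                      # arithmetic mean of the Y series
    x = [v - m1 for v in xs]              # deviations of X
    y = [v - m2 for v in ys]              # deviations of Y
    sigma_1 = sqrt(sum(d * d for d in x) / n)
    sigma_2 = sqrt(sum(d * d for d in y) / n)
    return sum(a * b for a, b in zip(x, y)) / (n * sigma_1 * sigma_2)

X = [1, 2, 3, 4, 5]
r_up = correlation_coefficient(X, [6, 8, 10, 12, 14])    # Y rises 2 units per unit of X
r_down = correlation_coefficient(X, [14, 12, 10, 8, 6])  # Y series reversed
print(round(r_up, 6), round(r_down, 6))  # 1.0 -1.0
```

The standard deviations here divide by n, not n − 1, to agree with the relation Σx² = nσ_1² used in the text.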




The above illustration suggests the question, “Will a 
linear relationship between X and Y always give perfect 
correlation?” 


Assume the linear relationship
$$Y = aX + b.$$
Since $y = Y - M_2$ and $x = X - M_1$,
$$y + M_2 = a(x + M_1) + b, \quad\text{or}\quad y = ax$$
(since $b + aM_1 - M_2 = 0$),
and
$$r = \frac{\sum xy}{n\sigma_1\sigma_2} = \frac{a\sum x^2}{n\,\sigma_1 \cdot |a|\,\sigma_1} = \frac{a}{|a|} = \pm 1.$$
(The sign of r depends upon the sign of a.)
Therefore a linear relationship between two variables will
give a correlation coefficient of +1 or −1 depending upon
whether large values of one occur with large values of the
other or large values of one occur with small values of the
other.
The converse of the above proposition is likewise true, i. e.,
if the coefficient of correlation (r) equals 1 then the relationship
between the X and Y series is linear.
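Assume $r = 1$;
then $(\sum xy)^2 = \sum x^2 \sum y^2$.
Letting $x_1 = \lambda_1 y_1$, $x_2 = \lambda_2 y_2$, . . . $x_n = \lambda_n y_n$ the above expression
becomes
$$y_1^2 y_2^2(\lambda_1 - \lambda_2)^2 + y_1^2 y_3^2(\lambda_1 - \lambda_3)^2 + \ldots = 0.$$
The only way in which this expression can equal zero
is by having
$$\lambda_1 = \lambda_2 = \lambda_3 = \ldots = \lambda_n,$$
and it follows that
$$x_1 = \lambda_1 y_1,\; x_2 = \lambda_1 y_2,\; \ldots\; x_n = \lambda_1 y_n$$
or
$$x = \lambda y$$
which denotes a linear relationship between X and Y.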
That any relation other than a linear one will not lead to 
r=1 is illustrated by the following: 


Let Y = X²
X = 1, 2, 3, 4, 5,       M_1 = 3
Y = 1, 4, 9, 16, 25,     M_2 = 11

Although the two series increase regularly, so that devia- 
tions of like signs always correspond, yet the correlation is not 
perfect because a linear relation does not exist between X and Y. 

If the number of items in each series be increased to 11 and 
the Y items remain squares of the X’s the value of r will be 
0.974. 
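
The two values for the relation Y = X² can be reproduced with a short computation. This is a sketch under the assumption that the X items are the consecutive integers 1 to n, as in the five-item illustration; the helper name is hypothetical.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """r computed as sum(xy) / sqrt(sum(x^2) * sum(y^2)), x and y being deviations."""
    n = len(xs)
    m1, m2 = sum(xs) / n, sum(ys) / n
    sum_xy = sum((a - m1) * (b - m2) for a, b in zip(xs, ys))
    sum_xx = sum((a - m1) ** 2 for a in xs)
    sum_yy = sum((b - m2) ** 2 for b in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

r_by_n = {}
for n in (5, 11):
    X = list(range(1, n + 1))   # assumed: X runs over 1, 2, ..., n
    Y = [v * v for v in X]      # the Y items are the squares of the X items
    r_by_n[n] = round(correlation_coefficient(X, Y), 3)

print(r_by_n)  # {5: 0.981, 11: 0.974}
```

Since Σx² = nσ_1² and Σy² = nσ_2², dividing by √(Σx²Σy²) gives the same result as dividing by nσ_1σ_2.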

If there be no law connecting the X and Y series the products 
of the deviations (xy) are as apt to be negative as positive. The 
expression Σxy will therefore tend to approach zero. With a
very large number of measurements the correlation coefficient 
will approximate zero. 
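
The tendency of r toward zero for unconnected series can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original argument: the two series are drawn independently from a pseudo-random generator, and the seed and series lengths are arbitrary choices made here.

```python
import random
from math import sqrt

def correlation_coefficient(xs, ys):
    """r computed as sum(xy) / sqrt(sum(x^2) * sum(y^2)), x and y being deviations."""
    n = len(xs)
    m1, m2 = sum(xs) / n, sum(ys) / n
    sum_xy = sum((a - m1) * (b - m2) for a, b in zip(xs, ys))
    sum_xx = sum((a - m1) ** 2 for a in xs)
    sum_yy = sum((b - m2) ** 2 for b in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

rng = random.Random(1910)       # arbitrary seed, fixed only for repeatability
r_by_n = {}
for n in (10, 1000, 100000):
    X = [rng.random() for _ in range(n)]   # no law connects X and Y
    Y = [rng.random() for _ in range(n)]
    r_by_n[n] = correlation_coefficient(X, Y)
    print(n, round(r_by_n[n], 4))
```

With few items the coefficient may stray well away from zero by chance; it is only for the long series that it is expected to lie close to zero.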

From the condition of no relationship to the condition of a 
linear relationship existing between the pair of series of meas-
urements the correlation coefficient varies from 0 to ±1.

Suppose that we are investigating the relation existing 
between two series of measurements X and Y. Let points be 
plotted on cross-section paper whose coördinates are corres-
ponding measurements X_1 and Y_1. If there be a relationship
existing between the two series, the points thus located will 
not lie chaotically all over the plane, but they will range them- 
selves about some curve or locus. This curve, which has been 
called the curve of regression, is illustrated in the accompanying 
diagram (p. 302).* The straight line best fitting the points is 
called the line of regression.†


*Distributions similar to that of the diagram may be found in Natural Selection in 
“Helix Arbustorum,” A. P. di Cesnola, Biometrika, Vol. V, Part IV, p. 392, and Anthrop-
ometry of Scottish Insane, J. F. Tocher, Biometrika, Vol. V, Part III.

†So named by Francis Galton from the fact that if any group of men be picked out for an
exceptional characteristic (say height) any other characteristic of those men correlated 
with height will regress toward the normal, or in other words the second characteristic will 
be less abnormal than the first. 


For the series Y = X² given above:

  x     y    x²    y²    xy
 −2   −10     4   100    20
 −1    −7     1    49     7
  0    −2     0     4     0
 +1    +5     1    25     5
 +2   +14     4   196    28
             10   374    60

σ_1 = 1.41, σ_2 = 8.65, r = 0.981

[Diagram: plotted points grouped about a curve of regression.]


For example suppose we consider the two series of index 
numbers for the period 1879-1904 inclusive, representing (1) 
money in circulation in the United States inclusive of bank 
reserves, and (2) bank reserves.* Let points be located with 
abscissas proportionate to the money in circulation and 
with ordinates proportionate to the bank reserves of the same 
year. The chart on the next page shows that these points 
lie near a straight line, the line of regression. 

The coefficient of correlation (r) is a measure of the closeness 
of the grouping of the points about this line of regression. If the 
points should all range themselves on a line then r would equal
+1 or —1 depending upon whether, looking left to right, the 
line sloped upward or downward. 

We will now derive the equation of the line of regression.
Let X and Y be associated measurements and x and y be
associated deviations from the respective arithmetic means.
A linear relation between the measurements is of the form
$$Y = a_1X + b_1.$$
The relation between the deviations will be of form
$$y = a_1x \quad\text{or}\quad y - a_1x = 0.$$

Since all of the points are not located exactly upon a straight
line the substitution of the values $x_1, y_1$; $x_2, y_2$; etc. in the equa-
tions will give residues $v_1$, $v_2$, etc. as follows:

$$y_1 - a_1x_1 = v_1,$$
$$y_2 - a_1x_2 = v_2,$$
$$\ldots$$
$$y_n - a_1x_n = v_n.$$

The values $\frac{v_1}{\sqrt{1+a_1^2}}, \frac{v_2}{\sqrt{1+a_1^2}}, \ldots \frac{v_n}{\sqrt{1+a_1^2}}$ equal
the distances of the various points to the straight line $y = a_1x$.

The equation of a line such that the sum of the squares of
the distances from the given points is a minimum will now
be found. In other words that value of $a_1$ will be taken which
makes $v_1^2 + v_2^2 + \ldots + v_n^2$ a minimum. To find the
value of $a_1$ for which $(y_1 - a_1x_1)^2 + (y_2 - a_1x_2)^2 + \ldots + (y_n - a_1x_n)^2$
will be a minimum, differentiate with respect to
$a_1$ and obtain $-2x_1(y_1 - a_1x_1) - 2x_2(y_2 - a_1x_2) - \ldots - 2x_n(y_n - a_1x_n)$.
In order that the original function be a mini-
mum, this derivative must equal zero. We will then have

*Kemmerer, Money and Prices, p. 141.
[Chart: money in circulation, inclusive of bank reserves, plotted as abscissas, and bank reserves as ordinates, with the line of regression. Columns I and II on p. 141 of “Money and Prices,” Kemmerer, years 1879-1904 inclusive.]


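$$\sum (xy - a_1x^2) = 0, \quad\text{or}\quad \sum xy - a_1\sum x^2 = 0,$$

$$a_1 = \frac{\sum xy}{\sum x^2}.$$

Similarly if $x = a_2y$, then $\sum xy - a_2\sum y^2 = 0$ will give the value
of $a_2$ for which the sum of the squares of the distances of the
given points to the straight line $X = a_2Y + b_2$ is a minimum, or

$$a_2 = \frac{\sum xy}{\sum y^2}.$$

Let $r = \frac{\sum xy}{n\sigma_1\sigma_2}$ and $\sum x^2 = n\sigma_1^2$, $\sum y^2 = n\sigma_2^2$.

The equations between the deviations are:

$$y = r\frac{\sigma_2}{\sigma_1}x \quad\text{and}\quad x = r\frac{\sigma_1}{\sigma_2}y.$$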


It may seem that the two equations just given are incon- 
sistent. But it must be remembered that these equations do 
not give the relationship existing between any corresponding 
pair of deviations unless all of the points lie exactly on a 
straight line and there be perfect correlation. For all cases of 
imperfect correlation a given deviation x will occur with several 
different deviations y (if we have a large number of measure- 
ments). If these deviations y are distributed according to the 
normal law of distribution then the given value x substituted 
in the first equation will give the mean of the deviations occur-
ring with the deviation x and if a given value y be substituted
in the second equation the value of x resulting will be the 
mean of the deviations of the associated characteristics. 
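
The two coefficients of the deviation equations can be checked numerically. The following sketch assumes the least-squares values a_1 = Σxy/Σx² and a_2 = Σxy/Σy² and compares them with rσ_2/σ_1 and rσ_1/σ_2; the five-item series Y = X² from the earlier illustration is reused, and all variable names are chosen here for illustration.

```python
from math import sqrt

X = [1, 2, 3, 4, 5]
Y = [1, 4, 9, 16, 25]
n = len(X)
m1, m2 = sum(X) / n, sum(Y) / n

x = [v - m1 for v in X]                      # deviations of X from its mean
y = [v - m2 for v in Y]                      # deviations of Y from its mean
sum_xy = sum(a * b for a, b in zip(x, y))    # 60
sum_xx = sum(a * a for a in x)               # 10
sum_yy = sum(b * b for b in y)               # 374

a1 = sum_xy / sum_xx    # coefficient in y = a1 * x
a2 = sum_xy / sum_yy    # coefficient in x = a2 * y

sigma_1 = sqrt(sum_xx / n)
sigma_2 = sqrt(sum_yy / n)
r = sum_xy / (n * sigma_1 * sigma_2)

print(a1, round(a2, 4), round(r, 3))
```

The two equations agree with one another only when r is +1 or −1; otherwise each gives the mean of one deviation for a given value of the other.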


Since $y = Y - M_2$ and $x = X - M_1$,

$$Y = M_2 + r\frac{\sigma_2}{\sigma_1}(X - M_1)$$

and

$$X = M_1 + r\frac{\sigma_1}{\sigma_2}(Y - M_2).$$

The coefficients $r\frac{\sigma_2}{\sigma_1}$ and $r\frac{\sigma_1}{\sigma_2}$ are called the coefficients of
regression of Y upon X and of X upon Y respectively. The
first coefficient ($r\frac{\sigma_2}{\sigma_1}$) and the reciprocal of the second ($\frac{\sigma_2}{r\sigma_1}$)
are the slopes of the lines of regression. If X and Y be meas-
ured in terms of their respective standard deviations as units
the slopes of the lines of regression will be $r$ and $\frac{1}{r}$. In other
words, the slope of the line of regression of Y upon X, each series
being measured in terms of its standard deviation, is equal to
the coefficient of correlation for the two series. For perfect posi-
tive correlation the line would make an angle of 45° with the X
axis, for perfect negative correlation the line would make an angle
of 135° with the X axis, and for no correlation the line would
be parallel to the X axis.
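
The statement that the slope equals r when each series is measured in its own standard deviation can be verified on a small example. This is a sketch only: the function `regression_lines` and its return layout are assumptions made for illustration, and the data are the five-item series Y = X² used earlier.

```python
from math import sqrt

def regression_lines(X, Y):
    """Return r, (intercept, slope) of Y upon X, (intercept, slope) of X upon Y, and the sigmas."""
    n = len(X)
    m1, m2 = sum(X) / n, sum(Y) / n
    sum_xy = sum((a - m1) * (b - m2) for a, b in zip(X, Y))
    sigma_1 = sqrt(sum((a - m1) ** 2 for a in X) / n)
    sigma_2 = sqrt(sum((b - m2) ** 2 for b in Y) / n)
    r = sum_xy / (n * sigma_1 * sigma_2)
    b_yx = r * sigma_2 / sigma_1     # coefficient of regression of Y upon X
    b_xy = r * sigma_1 / sigma_2     # coefficient of regression of X upon Y
    return r, (m2 - b_yx * m1, b_yx), (m1 - b_xy * m2, b_xy), (sigma_1, sigma_2)

X = [1, 2, 3, 4, 5]
Y = [1, 4, 9, 16, 25]
r, y_on_x, x_on_y, (s1, s2) = regression_lines(X, Y)

# The same series with each measured in its own standard deviation as unit.
X_std = [v / s1 for v in X]
Y_std = [v / s2 for v in Y]
r_std, y_on_x_std, x_on_y_std, _ = regression_lines(X_std, Y_std)

print("Y upon X: intercept %.3f, slope %.3f" % y_on_x)
print("slopes in standard units: %.3f and %.3f" % (y_on_x_std[1], 1 / x_on_y_std[1]))
```

For these data the line of Y upon X is Y = 11 + 6(X − 3), and in standard units its slope coincides with r, while the line of X upon Y has slope 1/r when drawn on the same axes.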

In the preceding we have fitted a straight line to the points. 
In some cases the points may arrange themselves along some 
locus such as could be represented by $y = a + bx + cx^2 + \ldots$
or $y = a + be^x$ or any other empirical function. The
work in fitting such a curve to the points would be great and 
the advantages are not sufficient to compensate for the addi- 
tional work.* 

The following table gives the correlation coefficients for the 
various pairs of series of statistics for which Dr. Kemmerer has 
attempted to show correlation by means of charts: 


                                                                     Coefficient of
Statistics of                                     Period Covered     Correlation
                                                                     ± Probable Error
Money in circulation inclusive of bank reserves,
  and bank reserves                                1879-1904         +0.98 ± .006
Business distrust, and ratio of bank reserve
  to check circulation                             1879-1904         +0.53 ± .095
Business distrust of the preceding year, and
  ratio of bank reserve to check circulation       1880-1904         +0.72 ± .064
Relative circulation, and general prices           1879-1901         +0.23 ± 0.13


*Karl Pearson in an article “On the General Theory of Skew Correlation and Non- 
linear Regression” (published by Dulau and Co. in 1905 as one of the Drapers’ Company 
Research Memoirs) discusses the cases in which the law connecting the two series of sta- 
tistics is not linear. He says, “In the great bulk of biometrical and economical enquiries, 
however, the regression does not diverge very markedly from the linear form. In the cases 
of non-linear regression that I have hitherto had to deal with, I find that parabolae of the 
second or third order will suffice as a rule to describe the deviations from linearity” (p. 21). 
Equations of parabolae of the second and third order are of the form 

$$y = a_0 + a_1x + a_2x^2$$
and $$y = a_0 + a_1x + a_2x^2 + a_3x^3$$


The correlation coefficients show that there is a very great 
difference in the degree of correlation of different pairs of series 
of statistics. The full significance of the “probable error,” 
which is used as a measure of unreliability of any determina- 
tion, cannot be developed at this point.* It is sufficient to 
note that, “When r is not greater than its probable error we
have no evidence that there is any correlation, for the ob-
served phenomena might easily arise from totally unconnected
causes; but, when r is greater than, say, six times its probable
error, we may be practically certain that the phenomena are
not independent of each other, for the chance that the observed
results would be obtained from unconnected causes is prac-
tically zero.”†
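
The rule quoted from Bowley can be applied mechanically. The article does not state the formula for the probable error at this point; the sketch below assumes the classical expression 0.6745(1 − r²)/√n, which reproduces the tabulated figures closely. The function names and the wording of the three verdicts are choices made here for illustration.

```python
from math import sqrt

def probable_error(r, n):
    """Assumed classical formula: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / sqrt(n)

def verdict(r, n):
    """Apply the quoted rule: compare |r| with one and with six probable errors."""
    pe = probable_error(r, n)
    if abs(r) <= pe:
        return "no evidence of correlation"
    if abs(r) > 6 * pe:
        return "practically certain the phenomena are not independent"
    return "indeterminate under the quoted rule"

# (r, number of years) taken from the table of coefficients above
for r, n in [(0.98, 26), (0.53, 26), (0.23, 23)]:
    print(r, n, round(probable_error(r, n), 3), verdict(r, n))
```

The quoted rule is silent on coefficients lying between one and six probable errors, so the middle verdict is left open here and is not treated as evidence either way.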

The high degree of correlation (+0.98) between money in 
circulation inclusive of bank reserves and bank reserves is 
due to the tendency of the two items to vary together during 
the long time period and not due to correspondence of minor 
fluctuations. The reasons for the great increase of money in 
circulation in the United States during the period 1879-1904 
are the great increase of population and the industrial expan- 
sion. Likewise the number of banks increased in order to serve 
the increased population and this meant an increase of total 
reserves. It is self-evident that the long time tendency of the 
two series of statistics must be upward in a growing country. 
It seemed to me that the bank reserves during the 26 years, 
1879-1904, would be as closely correlated with the population 
as with total circulation. The computation of the correlation 
coefficient between bank reserves and population gave +0.98. 
It is the variation upwards of both series during the entire 
period that causes the high coefficient. 
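
The effect of a common upward movement can be shown with constructed figures. The two series below are hypothetical and are not the bank reserve or population statistics: each is a straight-line growth plus a small fluctuation, and the two fluctuations are built so as to be entirely uncorrelated with one another.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """r computed as sum(xy) / sqrt(sum(x^2) * sum(y^2)), x and y being deviations."""
    n = len(xs)
    m1, m2 = sum(xs) / n, sum(ys) / n
    sum_xy = sum((a - m1) * (b - m2) for a, b in zip(xs, ys))
    sum_xx = sum((a - m1) ** 2 for a in xs)
    sum_yy = sum((b - m2) ** 2 for b in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

years = range(24)                                            # 24 hypothetical years
fluct_a = [1 if t % 2 == 0 else -1 for t in years]           # up, down, up, down, ...
fluct_b = [1 if t % 4 < 2 else -1 for t in years]            # up, up, down, down, ...
series_a = [100 + 5 * t + 3 * fluct_a[t] for t in years]     # growth plus fluctuation
series_b = [50 + 2 * t + 3 * fluct_b[t] for t in years]

r_series = correlation_coefficient(series_a, series_b)       # dominated by the growth
r_fluct = correlation_coefficient(fluct_a, fluct_b)          # the fluctuations alone

print(round(r_series, 2), r_fluct)
```

The coefficient for the two full series is high although the year-to-year fluctuations have no correspondence at all, which is the situation described for bank reserves and population.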

The correlation coefficient between the index numbers of 
business distrust and the ratio of bank reserves to check circu-
lation for the same years is 0.53. When the index numbers of
business distrust for one year are correlated with the ratio
of bank reserves to check circulation the following year the 
coefficient is 0.72. As Dr. Kemmerer has suggested (but 


*In the computation of the formulas for the probable error it is assumed that errors are
distributed “normally.”

†Bowley, Elements of Statistics, p. 320.


not verified), there is a closer correlation “when proper allow-
ance is made for the time required for alterations in business
confidence to exert their influence on bank reserves.”* The
lowest correlation (+0.23), that between relative circulation 
and general prices, is not high enough to warrant a conclu- 
sion that the items vary together. The smallness of the cor- 
relation indicated may have resulted either because the quan- 
tity theory is in error or because the statistics are not adequate 
to test the theory. Whatever may be the fact, the statistics 
and the method of measuring correlation presented by Dr. 
Kemmerer do not demonstrate that general prices move in 
sympathy with relative circulation. 

Is the contention of Mr. Cross that “the percentage of suc-
cessful strikes decreases during periods of business prosperity
and increases during ‘hard times’” supported by the sta-
tistics? The following table gives the correlation coefficients
between pairs of series of statistics, one of the pair in each 
case being the per cent. of establishments in which strikes suc- 
ceeded, and the other series being taken as indicative of busi- 
ness conditions: 


                                                                 Coefficient of
Statistics of                                     Period         Correlation
                                                                 ± Probable Error
Per cent. of establishments in which strikes
  succeeded
Index of wholesale prices from Aldrich            1881-1905      −0.146 ± 0.132
  Report and United States Labor Bureau†
Per cent. of successful strikes                   1881-1904
Per cent. of successful strikes (year ending
Per capita foreign trade (year ending June        1881-1905      −0.178 ± 0.130
Per cent. of successful strikes                   1881-1904
Index numbers of business distrust‡               1881-1904      134


The amount of correlation indicated in each case is small;
considering the number of years taken, so small that no con-
clusion as to the connection between the two series can be
*Money and Prices, p. 146.

†Reduced to a continuous series.
‡Kemmerer’s figures from Money and Prices, p. 141.


drawn. The correlation coefficient in the last instance, i. e.,
between per cent. of successful strikes and business distrust, 
suggests an opposite conclusion to that indicated by the other 
coefficients and that of Mr. Cross. The analysis shows that the 
conclusion that there is negative correlation between general 
prosperity and per cent. of successful strikes is not warranted. 

Finally, what is the degree of correlation between the prices
of British Consols and Sauerbeck’s index numbers of the prices 
of commodities? The chart on p. 297 indicates a greater degree 
of correlation (negative) between the minor fluctuations of the 
two series than shown by any of the pairs of series that we have 
considered. The coefficient of correlation based upon statis- 
tics for the 57 years from 1851 to 1907, inclusive, is —0.58+ 
0.06. A correlation coefficient of —0.58 based upon 57 pairs of 
items warrants the conclusion that the two series have inverse 
movements. 
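The probable error attached to this coefficient can be checked by
a short calculation. The sketch below is in Python and assumes the
conventional formula P.E. = 0.6745 (1 − r²)/√n, which is not
restated in this part of the article; the function name
probable_error is illustrative only.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n pairs.

    Assumption: the conventional formula 0.6745 * (1 - r**2) / sqrt(n).
    The article does not restate its formula at this point.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Consols and Sauerbeck's index: r = -0.58 from 57 pairs of items.
pe_consols = probable_error(-0.58, 57)

# Strikes and wholesale prices: r = -0.146 from the 25 years 1881-1905.
pe_strikes = probable_error(-0.146, 25)

print(round(pe_consols, 2), round(pe_strikes, 3))
```

Under that assumption the two results agree with the ± 0.06 and
± 0.132 printed in the text. A coefficient nearly ten times its
probable error, as with Consols, supports a conclusion; one that
barely exceeds its probable error, as with strikes, does not.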

The relations between the average deviations, x and y, of the
two series of statistics being considered are:*

     y = −1.465x   and   x = −0.2295y

The equations of regression are:

     Y = 225.6 − 1.465 X   and   X = 19.439 − 0.2295 Y

*If a value x₁ be substituted for the deviation x in the equation
y = r(σ₂/σ₁)x we ought to get an approximation to the average
value of the deviations of the Y character, all of which
characters are associated with the deviation x₁ of the X
character. Likewise if y₁ be substituted in x = r(σ₁/σ₂)y we
ought to get an approximation to the average of the associated x
deviations. The closeness of the approximation will increase as
the form of distribution of one series of characters associated
with a given value of the other character (called an array)
approaches the normal curve of error.

For certain pairs of time-series (corresponding items occur
at same time) of measurements a correlation coefficient
approximating zero may be obtained even though graphs of the
statistics show that the up-and-down fluctuations occur
together. This result will come about if the long-time
variations show opposite tendencies, as, for instance, in the
statistics of marriages and bank clearings in the United Kingdom.
On the other hand, a high correlation coefficient may be
obtained for two series having the same long-time tendency
regardless of the non-correspondence of the short-time
fluctuations. For example, the coefficient for the two series,
population and bank reserves, came out to be 0.98. This high
coefficient comes from the fact that the long-time variation of
both series is the same. Consequently, before it is legitimate to
draw any conclusions as to the meaning of a lack of correlation,
or amount of correlation between two series of measurements
it is necessary to ascertain the periodic and the secular
variations in the two series. This correlation coefficient may be
large through the correspondence of either secular or periodic
variation, or both. It may be null because one variation covers
up the other.
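Both effects can be reproduced with artificial figures. The
following Python sketch uses wholly hypothetical series (a shared
8-year wave, an unrelated 5-year wave, and straight-line
tendencies chosen for illustration); none of the numbers come from
the statistics discussed in this article.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


years = range(40)  # 40 hypothetical annual items
wave = [2 * math.sin(2 * math.pi * t / 8) for t in years]        # shared cycle
other_wave = [2 * math.cos(2 * math.pi * t / 5) for t in years]  # unrelated cycle

# Case 1: identical fluctuations, opposite long-time tendencies.
rising = [0.12 * t + w for t, w in zip(years, wave)]
falling = [-0.12 * t + w for t, w in zip(years, wave)]
r_opposite_trends = pearson(rising, falling)

# Case 2: unrelated fluctuations, same long-time tendency.
series_a = [1.0 * t + w for t, w in zip(years, wave)]
series_b = [2.0 * t + w for t, w in zip(years, other_wave)]
r_same_trend = pearson(series_a, series_b)
r_fluctuations_only = pearson(wave, other_wave)

print(round(r_opposite_trends, 2), round(r_same_trend, 2))
```

In the first case the coefficient is close to zero although the
two graphs rise and fall together; in the second it is close to
unity although the fluctuations themselves are uncorrelated.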

Three methods have been used for isolating the short-time 
variations of time-series of measurements. They will now be 
considered. 

1. If, upon plotting the two series being compared with time
as abscissa and the measurements as ordinates, periodic
variations appear at approximately equal intervals of time, the
curve may be “smoothed” and the secular variations may be
eliminated as follows:*

(a) Ascertain the length of the wave by finding the number
of time units between corresponding parts of the waves, i. e.,
crest to crest, or hollow to hollow. Let l represent the number
of time units found.

(b) Average groups of l consecutive measurements, placing
the points determined by these averages at the middle of each
group of measurements. Take enough groups so that the
points obtained will indicate the general tendency of the
series.

(c) Draw a smooth curve through the points located by the 
process described in (b). This curve shows the secular ten- 
dency. 

(d) Subtract (this can be done graphically on cross-section 
paper) the ordinates of the “smoothed” curve from those of 
the original curve in order to obtain the series of measurements 
of the periodic fluctuation. Let d stand for any one of these 
differences. 

(e) The coefficients computed for corresponding ordinates 
of two smoothed curves, and for corresponding differences, d 
and d’, give measures of the secular and periodic correlation, 
respectively. 
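Steps (a) to (e) can be sketched in Python as follows. Two things
here are assumptions and not part of the method as stated: the
hand-drawn smooth curve of step (c) is replaced by an average of
the l items centred on every date, and l is taken to be odd so
that each average falls on an actual date. The series and the
function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def moving_average_deviations(series, l):
    """Deviation d of each item from the mean of the l items centred on it.

    The first and last (l - 1) / 2 items have no complete group and
    are dropped.
    """
    if l % 2 == 0:
        raise ValueError("this sketch handles odd wave lengths only")
    half = l // 2
    deviations = []
    for i in range(half, len(series) - half):
        group = series[i - half:i + half + 1]
        deviations.append(series[i] - sum(group) / l)
    return deviations


# Hypothetical series: a common 9-year wave on opposite secular tendencies.
wave = [3 * math.sin(2 * math.pi * t / 9) for t in range(45)]
series_x = [100 + 0.5 * t + w for t, w in zip(range(45), wave)]
series_y = [80 - 0.3 * t + w for t, w in zip(range(45), wave)]

d = moving_average_deviations(series_x, 9)
d_prime = moving_average_deviations(series_y, 9)

r_original = pearson(series_x, series_y)  # dominated by the secular tendency
r_periodic = pearson(d, d_prime)          # correlation of the deviations

print(round(r_original, 2), round(r_periodic, 2))
```

With these artificial figures the original series are negatively
correlated, while the deviations from the 9-year means are almost
perfectly correlated, which is the contrast the method is meant
to expose.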


*Bowley, Elements of Statistics, pp. 176, et seq.


The method described above has been applied by Mr. R. H.
Hooker in his paper “On the Correlation of the Marriage-Rate
with Trade,”* and by Mr. G. U. Yule in his study of “Changes
in Marriage and Birth-Rates in England and Wales during the
Past Half Century.”† The following table gives the correlation
coefficients computed in the articles named for the
periodic variations:


Series                                     Period      Deviations     Coefficient of
                                                       from           Correlation

Marriage rate and [second series           1861-1895   9 yr. means    +0.86
  illegible]
[First series illegible] and amount of     1876-1895   9 yr. means    +0.47
  [word illegible] clearings per capita
[First series illegible] and Sauerbeck’s   1865-18?6   [?] yr. means  +0.795
  index of prices
[First series illegible] and Hartley’s     1870-1895   11 yr. means   −0.873
  index numbers of unemployment


The effect of using the deviations rather than the original 
series in computing the coefficient is shown by the comparison 
of the first correlation coefficient of +0.86, given above, with 
the correlation coefficient of +0.18, obtained for the same two 
series of original measurements for the same period, 1861-1895. 

Using the deviation-method, Mr. Yule computed the correlation
coefficients between first, the marriage rate of one year
(m), and second, exports (e), imports (i), total trade (t), the
price of wheat (w), and bank clearings (c) for the same year,
and for each of several preceding years in order to answer the
question, “does the maximum amount of correlation occur
when corresponding items are of same year or when the
marriage rate of one year is paired with the business item for a
preceding year?”
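The search for the lag of maximum correlation can be sketched in
Python. The first function pairs each marriage-rate item with the
business item of an earlier year; the second interpolates between
whole years by passing a parabola through three neighbouring
coefficients. The function names and the business series are
hypothetical. The three coefficients used as a check (0.352,
0.479, 0.418 at lags of 1, 2 and 3 years) are those quoted from
Mr. Yule further on in this article for marriage-rate and
birth-rate.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(business, marriage, lag):
    """Correlate marriage[t] with business[t - lag]; lag in whole years."""
    if lag == 0:
        return pearson(business, marriage)
    return pearson(business[:-lag], marriage[lag:])


def parabola_peak(x_mid, step, y_left, y_mid, y_right):
    """Vertex (position, height) of the parabola through three
    equally spaced points centred on x_mid."""
    curvature = y_left - 2 * y_mid + y_right
    shift = 0.5 * step * (y_left - y_right) / curvature
    height = y_mid - (y_left - y_right) ** 2 / (8 * curvature)
    return x_mid + shift, height


# Hypothetical deviations: the marriage item repeats the business
# item of the year before.
business = [3, 5, 4, 8, 6, 9, 7, 11, 10, 12]
marriage = [0] + business[:-1]
by_lag = {lag: lagged_correlation(business, marriage, lag) for lag in range(3)}

# Check against the three coefficients quoted from Mr. Yule.
lag_peak, r_peak = parabola_peak(2, 1, 0.352, 0.479, 0.418)
print(round(lag_peak, 2), round(r_peak, 3))
```

The parabola through Mr. Yule's three values peaks a little
beyond two years at a height of about 0.482, in agreement with
the figures he reports.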


*Journal of the Royal Statistical Society, September, 1901, Vol. 60.
†Ibid., Vol. 69, pp. 88-132.


The following table gives the maximum values found:

              For period 1861-1895             For period 1876-1895
Correlated    Business item       Value of     Business item       Value of
              precedes marriage-  Coefficient  precedes marriage-  Coefficient
              rate item by                     rate item by

r_me          [?] year            +0.86        [?] year            +0.90
r_mi          [?] year            +0.82        [?] year            +0.86
r_mt          [?] year            +0.91        [?] year            +0.92
r_mw          ...                 ...          0 year              +0.56
r_mc          ...                 ...          1 year              +0.92

([?] marks a fraction of a year that is illegible in the source.)

Having found that the trade cycle affects marriage rate Mr.
Yule asks the question, “Does the cycle affect the birth-rate,
and how?”* The relation between marriage-rate and birth-rate
is shown by the following table:

Series              Period      Deviations      (a) Correlated     Correlation
                                from            with (b) of        Coefficient ± P. E.

(a) Marriage rate   1850-1896   11 year means   1 yr. following    0.352 ± 0.086
(b) Birth rate                                  2 yrs. following   0.479 ± 0.076
                                                3 yrs. following   0.418 ± 0.082

Mr. Yule says, “Fitting a parabola to the three values thus
determined, a maximum correlation of about 0.482 must subsist
between the birth-rate and the marriage-rate of 2.17 (two
years and two months) previously.”†

Further analysis leads Mr. Yule to the conclusion that
birth-rate is independently (not through marriage-rate only)
sensitive to short-time economic changes and that the
birth-rate is lowered after a depression, not only because of a
decrease in the number of marriages during such depression, but
also because of a decrease in fertility.

*Ibid., p. 123.
†Ibid., p. 123.

2. In case the statistics show a long-time tendency with no
regular periodic fluctuation Mr. R. H. Hooker has suggested
that the “differences between successive values of the two
variables, instead of the differences from the arithmetic
means”* be correlated. Put into mathematical symbols we
have:

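Letting $X_0, X_1, \ldots, X_n$ and $X'_0, X'_1, \ldots, X'_n$
represent two series of measurements, $d_1, d_2, \ldots, d_n$ and
$d'_1, d'_2, \ldots, d'_n$ represent differences between any two
consecutive measurements, and $d_m$ and $d'_m$ represent the
respective means of these differences, then

$$d_m = \frac{X_n - X_0}{n} = \frac{\Sigma d}{n}, \quad \text{and} \quad d'_m = \frac{X'_n - X'_0}{n} = \frac{\Sigma d'}{n},$$

and the standard deviations of the differences are

$$\sigma_d = \sqrt{\frac{\Sigma (d - d_m)^2}{n}}, \qquad \sigma_{d'} = \sqrt{\frac{\Sigma (d' - d'_m)^2}{n}},$$

and the coefficient of correlation is

$$\rho = \frac{\Sigma (d - d_m)(d' - d'_m)}{n \sigma_d \sigma_{d'}} = \frac{\Sigma d d' - n d_m d'_m}{\sqrt{(\Sigma d^2 - n d_m^2)(\Sigma d'^2 - n {d'_m}^2)}}.$$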


Comparing this method of differences with the method
described in (1) Mr. Hooker says, “Correlation of the
deviations from an instantaneous average (or trend) may be
adopted to test the similarity of more or less marked periodic
influences, correlation of the difference between successive
values will probably prove most useful where the similarity
of the shorter rapid changes (with no apparent periodicity)
are the subject of investigation, or where the normal level of
one or both series of observations does not remain constant.”†
He finds that the ordinary correlation coefficient (r) for the
price of corn in Iowa and total production in the United States
for the period 1870-1899 is −0.28, while ρ = −0.84.

*Ibid., Vol. 68, p. 697.
†Ibid., p. 703.
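The method of differences lends itself to a short Python sketch.
The function below forms the successive differences and applies
the second (computational) form of the coefficient ρ; the two
series are hypothetical and were chosen so that a common upward
tendency hides opposite year-to-year movements.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def difference_correlation(xs, ys):
    """Coefficient rho between the successive differences of two series."""
    d = [b - a for a, b in zip(xs, xs[1:])]
    d_prime = [b - a for a, b in zip(ys, ys[1:])]
    n = len(d)
    d_m = sum(d) / n
    d_prime_m = sum(d_prime) / n
    numerator = sum(u * v for u, v in zip(d, d_prime)) - n * d_m * d_prime_m
    denominator = math.sqrt(
        (sum(u * u for u in d) - n * d_m ** 2)
        * (sum(v * v for v in d_prime) - n * d_prime_m ** 2)
    )
    return numerator / denominator


# Hypothetical series: both rise one unit a year, but one moves up
# in exactly the years in which the other moves down.
xs = [t + (-1) ** t for t in range(20)]
ys = [t - (-1) ** t for t in range(20)]

r = pearson(xs, ys)                   # ordinary coefficient
rho = difference_correlation(xs, ys)  # coefficient of the differences

print(round(r, 2), round(rho, 2))
```

Here r is high and positive while ρ is strongly negative, the
same kind of reversal that separates r from ρ in the examples
cited in the text.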


I have computed ρ for the statistics of corn production
in the United States and the average farm price on December
1* for the period 1866-1906 and find ρ = −0.833 ± 0.034.
Letting x represent the production difference in millions of
bushels, and y represent the price difference in cents per
bushel, the equations of regression are


     y = −0.0256x + 1.132
     x = −27.05y + 46.42


A graphic representation of the points whose abscissas and
ordinates are the corresponding production and price
differences, respectively, and the line of regression is given on
page 315. The lack of correlation between the original pair of
series is shown by the chart on page 316.
From the equations of regression such statements as the
following can be made:†


(i) For no change in corn production there is an increase in 
price of 1.132 cents per bushel. 


(ii) For an increase in production of 100 million bushels the 
price decreases 1.43 cents per bushel. 


(iii) For a decrease in production of 100 million bushels the 
price increases 3.69 cents per bushel. 


(iv) For a stationary price the production must increase 
46 million bushels per year. 
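Statements (i) to (iv) follow directly from the two equations of
regression, as the following Python check illustrates. The
function names are illustrative. The last line relies on the
general property that the product of the two regression slopes
equals the square of the correlation coefficient.

```python
import math


def price_difference(production_difference):
    """Expected change in price (cents per bushel) for a given change
    in production (millions of bushels): y = -0.0256 x + 1.132."""
    return -0.0256 * production_difference + 1.132


def production_difference(price_diff):
    """Expected change in production (millions of bushels) for a given
    change in price (cents per bushel): x = -27.05 y + 46.42."""
    return -27.05 * price_diff + 46.42


statement_i = price_difference(0)         # no change in production
statement_ii = price_difference(100)      # 100 million bushels more
statement_iii = price_difference(-100)    # 100 million bushels less
statement_iv = production_difference(0)   # stationary price

# The two slopes have the same sign as rho and multiply to rho squared.
implied_rho = -math.sqrt(0.0256 * 27.05)

print(round(statement_i, 3), round(statement_ii, 2),
      round(statement_iii, 2), round(statement_iv), round(implied_rho, 3))
```

The four results reproduce the figures in the statements above,
and the coefficient implied by the slopes agrees with the computed
ρ = −0.833 to within the rounding of the slopes.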


It seemed to me that if percentage changes in price and
production were used instead of absolute changes a still
closer correlation might result. The computation of ρ from such
percentages, however, gave −0.794.


*Statistical Abstract of the United States, 1906, p. 543.

†The writer is making similar computations for wheat, oats, rye, barley, buckwheat,
pig-iron, wool, bituminous coal and anthracite coal. It seems to me that the determination
of the correlation between money and prices might be carried out by this method. After
having allowed for the influence of changes in the supply of the various articles on those
articles the influence of the change in the supply of money upon all the articles might be
determined.


[Chart, p. 315: corn production differences (abscissas) plotted
against price differences (ordinates), with the line of
regression.]

[Chart, p. 316: Corn Production and Prices, 1866-1906. Full line =
bushels of corn; second line = cents per bushel.]


In the preceding illustrations the amount of correlation
between the differences was greater than that between the
original series. The method of differences has also been used
by the writer for Kemmerer’s statistics (considered on page
15 of this article) of (1) money in circulation, and (2) bank
reserves for the period 1879-1904 with the result ρ = +0.392,
whereas the value of r is 0.98. This shows that there is a lack
of correspondence of the short-time variations in these two
series.

3. A third method of eliminating the long-time tendency and
thus isolating the short-time fluctuations is to assume some
curve, represented by an algebraic equation, which “fits”
the statistics in question. The first step in the process is to
select some curve, which, for a priori or other reasons is
considered the best representation of the “growth element.”* The
second step is to fit the curve to the statistics; stated
algebraically, to determine the constants in the equation of the
curve by use of the actual data.† Finally the deviations of the
original measurements from the smooth curve (called by
Norton “the growth axis”) are computed. The accuracy
with which one law, the geometric, y = bc^x, describes the
population of the United States, and consequently many things
that depend upon population, is shown by the following
diagram. The full points are fixed by the actual population
according to each of the censuses from 1850. The smooth line is
the graph of the equation


     y = 24,086,000 (1.0238)^x,


which equation was determined from the actual population.
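The three steps can be sketched in Python. Two assumptions are
made that the text does not state: x is counted in years from
1850, and the constants are found by a least-squares straight
line through the logarithms of the measurements, which is one
common way of fitting a geometric curve and not necessarily the
one used for the equation above. The function names are
illustrative, and the check uses points generated from the
printed equation, not census figures.

```python
import math


def geometric_curve(x, b=24_086_000, c=1.0238):
    """Growth axis y = b * c**x, with x assumed to be years since 1850."""
    return b * c ** x


def fit_geometric(xs, ys):
    """Fit y = b * c**x by least squares on (x, log y); returns (b, c)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mx, ml = sum(xs) / n, sum(logs) / n
    slope = (sum((x - mx) * (v - ml) for x, v in zip(xs, logs))
             / sum((x - mx) ** 2 for x in xs))
    intercept = ml - slope * mx
    return math.exp(intercept), math.exp(slope)


def percentage_deviations(xs, ys, b, c):
    """Deviation of each measurement from the growth axis, in per cent."""
    return [100 * (y - b * c ** x) / (b * c ** x) for x, y in zip(xs, ys)]


# Check on points generated from the printed equation itself.
census_years = [0, 10, 20, 30, 40, 50]           # 1850 ... 1900
on_curve = [geometric_curve(x) for x in census_years]
b_fit, c_fit = fit_geometric(census_years, on_curve)
deviations = percentage_deviations(census_years, on_curve, b_fit, c_fit)

print(round(b_fit), round(c_fit, 4), round(geometric_curve(50) / 1e6, 1))
```

Applied to actual measurements, the percentage deviations returned
by the last function are the short-time fluctuations that would
then be correlated.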


*J. P. Norton gives a table of interpolation forms from Steinhauser on p. 25 of the New
York Money Market.

†The method of fitting curves to statistics constitutes a separate subject. An extensive
use of mathematics is necessary in order to develop this subject fully.


[Chart, p. 318: population of the United States in millions,
1850-1900. Dots denote population according to the census; the
smooth line is y = 24,086,000 (1.0238)^x.]


Prof. J. P. Norton has applied the method here described to
determine the correlation existing between percentage of
reserves to deposits of New York Associated Banks and call
rates.* Weekly statistics were taken for the period 1885-1900.
The growth axes assumed were the geometric curve, y = bc^x,
and the straight line y = a, respectively. (y = the
measurement, x = time measured in weeks, while a, b and c are
constants to be determined from the data.) The typical periodic
fluctuations of percentage deviations of reserves and loans
were also correlated by this method, using y = bc^x as the growth
function in both cases. The following table gives the
correlation coefficients, ρ:


Series                                                    ρ ± P. E.

Reserve deviations and discount rate                      −0.37 ± 0.02
(a) Reserve and (b) Loan Periods:
    Immediate                                             +0.49 ± 0.07
    (a) precedes (b) by two weeks                         +0.87 ± 0.02
    (a) precedes (b) by three weeks                       +0.96 ± 0.01
    (a) precedes (b) by four weeks                        +0.91 ± 0.05


The conclusion from this study is that “the loan period
is really the shadow of the reserve period” . . . and
apparently follows the latter by “an interval of approximately
three weeks.”†

Up to this point the problem before us has been the measure- 
ment of the amount of correlation between two variables. 
This is the simplest case of the general problem of the measure- 
ment of the amount of correlation between one series of meas- 
urements, and a group of any number of series of measure- 
ments. The solution of the general problem leads to very 
complex relations,§ and it will not be taken up here. The 
case of three variables will be considered briefly. 


*J. P. Norton, Statistical Studies in the New York Money Market.

†Ibid., p. 96.

‡Ibid., p. 94.

§Yule, G. U., On the Theory of Correlation, Journal of the Royal Statistical Society,
Vol. 60, p. 835.


Messrs. R. H. Hooker and G. U. Yule have considered the
problem of finding the relation between the production of wheat
in India during the period 1890-1904 (years ending March
31), the price of wheat (calendar years), and the exports of the
subsequent twelve months, 1891-1905 (years ending March
31). The correlation of the annual differences according to
the method described in (2) of page 312 gives the following
results:


Series Correlated                                                   Coefficient of
                                                                    Correlation

Exports and Production                                              +0.77
Exports and Price                                                   +0.86
Production and Price combined in the ratio 1:1, and Exports         +0.90
Production and Price combined in the ratio 3:1, and Exports         +0.81
Production and Price combined in the ratio 1:3, and Exports         +0.58


The table indicates that exports depend upon production 
and price, and depend equally upon them. 

Messrs. Hooker and Yule give the following general solution 
of the special problem just considered: 

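To find the maximum correlation coefficient between $x_1$
and $x_2 + b x_3$ that results from considering b a variable, where
$x_1$, $x_2$, and $x_3$ are the deviations of the series $X_1$, $X_2$,
and $X_3$ from their respective arithmetic averages.

Let $x_2 + b x_3 = z$;

then $$\Sigma(x_1 z) = n \sigma_1 (r_{12}\sigma_2 + b\, r_{13}\sigma_3)$$

and $$\Sigma(z^2) = n (\sigma_2^2 + b^2 \sigma_3^2 + 2 b\, r_{23} \sigma_2 \sigma_3),$$

so that

$$r_{x_1 z} = \frac{r_{12}\sigma_2 + b\, r_{13}\sigma_3}{\sqrt{\sigma_2^2 + b^2 \sigma_3^2 + 2 b\, r_{23} \sigma_2 \sigma_3}}.$$

To find the value of b for which this is a maximum,
differentiate with respect to b and equate to zero; then

$$b = \frac{(r_{13} - r_{12} r_{23})\,\sigma_2}{(r_{12} - r_{13} r_{23})\,\sigma_3},$$

which gives the maximum value

$$r_{x_1 z}^2 = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}.$$

Computing $r_{x_1 z}$ from the data of Indian production, price,
and exports of wheat the value 0.905 is obtained.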
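The solution can be verified numerically. The Python sketch below
computes b and the maximum coefficient from the three ordinary
coefficients and the two standard deviations, and then confirms on
a small set of wholly hypothetical figures (not the Indian wheat
data, which are not reproduced here) that the direct correlation
of x₁ with x₂ + bx₃ reaches that maximum. The function names are
illustrative.

```python
import math


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def standard_deviation(xs):
    """Root-mean-square deviation from the arithmetic average."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def best_weight(r12, r13, r23, sigma2, sigma3):
    """Weight b and the extreme correlation of x1 with x2 + b*x3."""
    b = (r13 - r12 * r23) * sigma2 / ((r12 - r13 * r23) * sigma3)
    r_max = math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23)
                      / (1 - r23 ** 2))
    return b, r_max


# Hypothetical measurements of three series.
x1 = [3, 3, 8, 6, 12, 9, 14, 13]
x2 = [1, 2, 3, 4, 5, 6, 7, 8]
x3 = [2, 1, 4, 3, 6, 4, 7, 5]

b, r_max = best_weight(pearson(x1, x2), pearson(x1, x3), pearson(x2, x3),
                       standard_deviation(x2), standard_deviation(x3))


def combined(weight):
    """Correlation of x1 with x2 + weight * x3, computed directly."""
    return pearson(x1, [u + weight * v for u, v in zip(x2, x3)])


print(round(b, 3), round(r_max, 4), round(combined(b), 4))
```

Any other weight gives a smaller coefficient, which is the sense
in which the ratios 1:1, 3:1 and 1:3 in the table above are
trials around the best combination.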


Mr. G. U. Yule, in the paper already referred to,* has
worked out the general solution of the problem of the
correlation between three variables. In the course of the solution
the problem just considered is solved incidentally. The argument
is similar to that used in the case of two variables and so it
will not be repeated here. A concrete notion of the results
secured by Mr. Yule can be obtained from the following
explanation taken from Mr. Hooker’s article on the “Correlation
of the Weather and the Crops.”†


*Note on Estimating the Relative Influence of Two Variables upon a Third, Journal of
the Royal Statistical Society, Vol. 69, pp. 197-200.
†Journal of the Royal Statistical Society, Vol. 70, pp. 5 and 6.

“I have in the first place formed the ordinary coefficient

$$r = \frac{\Sigma xy}{n \sigma_1 \sigma_2}$$

between the crop and (a) rainfall, (b) accumulated
temperature above 42°. But rainfall and temperature are
themselves correlated; hence an apparent influence of, say,
rainfall upon a crop may really be due to rainfall conditions
being dependent upon temperature, or vice versa. Hence it
seemed desirable to calculate the partial or net correlation
coefficients, i. e. (following the notation given in Mr. Yule’s
paper of 1897),

$$\rho_{12} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}, \qquad \rho_{13} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}.$$

“This partial coefficient (ρ) may be regarded as a truer
indication of the connection between the crop and each factor
alone, inasmuch as, speaking approximately, we may say that
the effect of the other factor is eliminated. It may be
observed, moreover, that the relative influence of rainfall and
temperature upon the crop is given by

$$\frac{\rho_{12} \sqrt{1 - r_{13}^2}}{\rho_{13} \sqrt{1 - r_{12}^2}};$$

or, more accurately, this fraction measures the relative effect
of changes equal in amount to their respective standard
deviations in the rainfall and temperature. In discussing the
figures in the tables I shall accordingly utilize the partial
correlation coefficients rather than the others. Finally, I have
worked out what Mr. Yule calls the coefficient of double
correlation between the crop and rainfall and accumulated
temperature above 42°,

$$R = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}},$$

or as it may also be written,

$$R = \sqrt{1 - (1 - r_{12}^2)(1 - \rho_{13}^2)},$$

a form which is quicker to calculate. This may be regarded
as a measure of the joint influence of the rainfall and the
temperature upon the crop. For the sake of brevity, I shall
speak of R as measuring the effect of the ‘weather,’ using this
term in the strictly limited sense of consisting only of these
two factors. . . .

“I propose to regard a coefficient between 0.3 and 0.5 as
suggestive of dependence. Values below 0.3 I shall, as a rule,
ignore, in the absence of any corroborative evidence. Perhaps
I may remark that I believe that some statisticians would
consider themselves justified in drawing deductions from
lower coefficients than those I have adopted as my limits.”*


Mr. Yule notes that the partial or net correlation coefficient
retains three of the chief properties of the ordinary coefficients:
“(1) it can only be zero if both net regressions are zero; (2) it
is a symmetrical function of the variables (i. e., ρ₁₂ = ρ₂₁);
(3) it cannot be greater than unity.”†
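The partial coefficients and the coefficient of double correlation
are easily computed from the three ordinary coefficients. The
Python sketch below follows the usual definitions, with 1 for the
crop, 2 for rainfall and 3 for temperature; the numerical values
of r₁₂, r₁₃ and r₂₃ are hypothetical and the function names are
illustrative. It also confirms that the two ways of writing R
agree.

```python
import math


def partial_correlation(r_ab, r_ac, r_bc):
    """Net correlation of a and b with the third variable c eliminated."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def double_correlation(r12, r13, r23):
    """Coefficient of double correlation of 1 with 2 and 3 jointly."""
    return math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23)
                     / (1 - r23 ** 2))


def double_correlation_quick(r12, rho13):
    """The alternative form, using one ordinary and one partial coefficient."""
    return math.sqrt(1 - (1 - r12 ** 2) * (1 - rho13 ** 2))


# Hypothetical ordinary coefficients: crop-rain, crop-temperature,
# rain-temperature.
r12, r13, r23 = 0.6, 0.5, 0.3

rho12 = partial_correlation(r12, r13, r23)  # crop and rainfall, net
rho13 = partial_correlation(r13, r12, r23)  # crop and temperature, net
R = double_correlation(r12, r13, r23)
R_quick = double_correlation_quick(r12, rho13)

# Relative influence of rainfall and temperature upon the crop.
relative_influence = (rho12 * math.sqrt(1 - r13 ** 2)
                      / (rho13 * math.sqrt(1 - r12 ** 2)))

print(round(rho12, 3), round(rho13, 3), round(R, 3), round(R_quick, 3))
```

With these figures the net coefficients are smaller than the
ordinary ones, because part of each apparent influence is shared
with the other factor, while R exceeds either ordinary
coefficient taken alone.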

The various illustrations which have been cited show the
importance of questions of correlation in economics. The
ordinary graphic method of measuring correlation is
inadequate. The coefficient of correlation is simple and yet is
sensitive to small changes. It has been used in many fields
of statistics by Galton, Pearson, Yule, Hooker, Elderton and
others. The experience of these writers warrants the
adoption of the coefficient of correlation by economists as one of
their standard averages.


*Journal of the Royal Statistical Society, Vol. 70, p. 5.
†Ibid., Vol. 60, p. 833.


SCOPE AND METHODS OF PRESENTATION OF THE 
RESULTS OF THE THIRTEENTH CENSUS OF 
POPULATION.* 


By W. F. Willoughby, Assistant Director of the Census.


The director of the census, Hon. E. Dana Durand, in two
addresses delivered before the American Statistical
Association, the one entitled “Census Methods,” delivered at the
regular quarterly meeting of the Association held in Washington,
September 24, 1909, and the other entitled “Changes in
Census Methods for the Census of 1910,” delivered at the annual
meeting of the Association in New York, December 29, 1909,
has given a general account of the plans that have been
followed in the taking of the Thirteenth Census.† I have been
requested to supplement these papers by one giving further
information regarding the plans that have been formulated for
presenting the results of this census to the public in the
published reports. The following paper is an attempt to comply
with this request. In doing so, the policy adopted by the
director in his papers of seeking, as his chief object, to point
out the features in respect to which the methods employed at
the Thirteenth Census represent departures from prior
practice, will be followed.

The published results of the census are naturally determined 
by the character of the interrogatories contained on the 
schedules upon which the data is collected. That what I 
shall have to say regarding the manner and form in which the 
results will be published may be understood it will be neces- 
sary for me to say a few words regarding the inquiries pro- 
pounded, and thus to possibly cover to a slight extent the 
ground traversed by the director of the census in his papers. 


*Paper read at a regular quarterly meeting of the American Statistical Association, 
Ebbett House, Washington, D. C., November 21, 1910. 


†These papers have been published in the Quarterly Publications of the Association,
New Series, No. 88, December, 1909; and New Series, No. 89, March, 1910.


If we examine the schedule employed for obtaining the data
regarding population at the Thirteenth Census, we find that
information concerning the population of the country
on the census day, April 15, 1910, was sought on the
following points:

(1) Sex;
(2) Age;
(3) Color or race;
(4) Nativity; as determined by (a) place of birth of the person
      himself, and (b) place of birth of each of the parents;
(5) Mother tongue, or native language;
(6) Language actually spoken by those not able to speak English;
(7) Occupation, and industry in which such occupation was exercised,
      and whether the occupational status was that of employer,
      employee, or working on own account;
(8) Literacy, as determined by ability to read, and ability to write;
(9) Prevalence of school attendance as determined by attendance
      at school any time subsequent to September 1, 1909;
(10) Whether a survivor of the Union army or navy; or of the
      Confederate army or navy;
(11) Whether blind in both eyes;
(12) Whether deaf and dumb;
(13) For married persons, the number of years of present marriage and
      whether such marriage was the first, or a second or subsequent
      marriage;
(14) For married, widowed, or divorced women, the number of children
      that they had borne and the number living on the date of
      enumeration;
(15) For foreign born persons, the year of immigration to the United
      States, and whether naturalized or an alien;
(16) For employees, whether employed on April 15, 1910, or not, and
      number of weeks unemployed during the calendar year 1909; and
(17) For persons returned as heads of families, whether they owned or
      rented the houses occupied by them, and if owning whether
      the houses were owned free of mortgage or not.

Comparing this list with points canvassed in 1900 it will
be found that the departures consist, partly in covering the
points concerning which information was sought in a slightly
different way from that employed in 1900, and partly in
asking for data not sought at all at that census. The points of
difference are briefly as follows:

1. The census of 1900 made no effort to determine whether,
in the case of married persons, the present marriage represented
the first or subsequent marriage. This information is called
for by the schedule for 1910.

The importance of this question lies in the fact that it will 
furnish the data by which to establish the relationship between 
number of children and duration of marriage. This cannot be 
done satisfactorily unless it is possible to segregate and sepa- 
rately consider the number of children and their ages in rela- 
tion to the duration of a single marriage by eliminating cases 
in which the children may have been the offspring of a prior 
marriage. 

2. In 1900 data were called for relative to the number
of children born and the number of children living at the
date of the enumeration, the same as at the present census,
but on account of lack of time, expense involved, and other
administrative reasons, they were never tabulated. It is now
proposed to tabulate this information for both censuses and
it is believed that a very valuable body of information
concerning this important subject of fertility, size of family, and
allied questions will be afforded.

3. In 1900 but a single column was devoted on the schedule 
to securing a return of the occupations of the people. At the 
present census three columns were employed in order to 
secure, on the one hand, more definite information concerning 
the industry in which each occupation reported was exercised 
and, on the other hand, the occupational status of the person as 
employer, employee, or working on own account. The 
instructions given to the enumerators relative to the manner 
in which these questions should be answered were prepared 
with the greatest possible care, and every effort was made to 
emphasize the desire of the bureau to secure full and accurate 
returns regarding this subject. Indeed, I may state that 
this question of occupation has been, so to speak, featured by 
the bureau in its efforts in respect to the present census. Very 
detailed studies have been made of the problems involved in 
the classification of industries and occupations with a view to 
presenting the results in the manner and form which will be
of the greatest value. It is certain that no like effort has been
made in the past in the United States to secure accurate
information regarding this very important subject, and it
is doubtful if any foreign government has gone farther or as
far in this direction. The technical difficulties in the way of
securing and presenting statistics of occupations in such a
form as is theoretically desirable are very great. Most of
these difficulties are inherent ones which no amount of care 
can wholly overcome. The results that will be obtained will, 
in some respects, fall short of what is desirable, but I think I 
need have no hesitation in saying that as published they will 
present a body of material of far greater value and accuracy 
than any heretofore given to the public. 

4. In 1900 the inquiry relative to unemployment asked sim- 
ply for the length of time unemployed during the preceding 
year. In the present census an addition to this inquiry is 
made as to whether each person returned as an employee was 
employed or not on the date of the enumeration. The Cen- 
sus Bureau does not lay great stress upon the value of the in- 
formation obtained by it regarding this subject of unemploy- 
ment as returned on the general population schedule. The 
inquiry is made since it constitutes one of the questions 
which Congress has directed should be included among the 
interrogatories on the population schedule. It is believed, 
however, that the question regarding whether unemployed or 
not on the date of the enumeration will furnish information 
more valuable than that obtained at the prior census. It at 
least has the merit of calling for replies that can be made with 
a greater degree of accuracy than can be made to the other 
question concerning this subject which asks for the length 
of time unemployed during the preceding year. 

5. In 1900 the inquiry relative to language spoken simply 
called for a return as to whether the person was able to speak 
English or not. In the Thirteenth Census this inquiry was 
amplified so as to require the return of the specific language 
spoken in the case of all persons not able to speak English. 
It is hardly necessary to state that the information that will 
be obtained in this way regarding the languages spoken by 
that part of the population not speaking English will be of 
interest and value. 

6. In 1900 the inquiry relative to school attendance called 
for a return of the number of months of school attendance
during the preceding year. At the present census the inquiry
is restricted to the simple fact as to whether school was 
attended at any time subsequent to September 1, 1909. The 
reasons for making this change lay in the fact that it was 
found that the replies to the question as propounded in 1900 
were not in every respect satisfactory and it was believed 
that the record would be still less satisfactory this time be- 
cause the census was taken in April before the completion of 
the school year, former censuses being taken in June. 

7. The inquiry made in the Thirteenth Census relative to 
whether the person was a survivor of the Union or Confederate 
army or navy is a new inquiry, and its purpose is obvious. 
The instructions to the enumerators regarding the manner in 
which this inquiry should be answered required a separate 
return of the number of persons falling in each of the four 
classes of survivors of: the Union army, the Union navy, the 
Confederate army and the Confederate navy. 

8. I have left to the last the statement of the change which 
was the most important of all that were made. This change 
consists in the requirement of a return of the mother tongue 
of persons born abroad, and of parents who were born abroad 
in the case of persons born in this country but having one 
or both parents born in a foreign country. 

This additional inquiry relative to mother tongue was ex- 
pressly authorized by Congress through an amendment made 
to the Thirteenth Census Act under date of March 24, 1910. 
This amendment provided that the inquiries relative to 
population should call for information ‘‘respecting the nation- 
ality or mother tongue of all persons born in foreign coun- 
tries, and of the nationality or mother tongue of parents of 
foreign birth of persons enumerated.” 

The departure from prior practice thus authorized by 
Congress, in respect to the character of information that 
should be obtained regarding that part of the population 
of the United States that was born abroad, or one or both of 
whose parents were born abroad is of great importance. 
There are few, if any, facts regarding the population of the 
country that are more desired than that concerning the number
and character of the population of the country of foreign
birth. Prior to the Thirteenth Census the number and
character of this class of the population could only be presented 
according to the country of birth of the persons enumerated 
and of their parents. Students of social conditions and of 
such sciences as anthropology and ethnology have for years 
pointed out that this data, though of great value, falls short 
of furnishing all the information that it is desirable to have 
regarding the character of that part of our population which 
is of foreign origin. Its defect lies in the fact that, in the case 
of many countries such as Germany, Austria-Hungary, 
Russia, the Balkan States, Switzerland, etc., from which 
many of the immigrants to this country in recent years have 
come, country of birth is no certain indication of the racial 
stock of the persons returned as born in such countries. A 
mere presentation of the number of persons residing in the 
United States, the country of whose birth, or that of their 
parents, was one or the other of these countries, or some other 
country, thus does not furnish a thoroughly adequate idea 
of the racial composition of our population of foreign origin. 
Notwithstanding this it was impracticable to adopt mother 
tongue or racial stock as the primary basis for the classi- 
fication of persons of foreign origin, owing to the fact that 
at all prior censuses the unit of presentation employed has 
been that of country of birth, and comparisons could only be 
had with conditions as shown at those censuses by continuing 
to use that basis. For this reason, and also because the law 
rendered it obligatory, the schedule employed in the Thir- 
teenth Census called for a return of the population of foreign 
origin according to both country of birth and mother tongue. 
It should be stated that the criterion of mother tongue was 
employed as representing the best test that could be applied 
by untrained enumerators for determining racial stock. 

I have devoted considerable time to this subject as it is 
not only one of importance, but, as will be shown hereafter, 
modifies very greatly the character of, and the manner in 
which will be presented, the data relative to population. 

In concluding this survey of the points of difference as re- 
gards the population census between the two enumerations 
of 1900 and 1910, it is of interest to note that in no case has a
subject covered in 1900 been dropped in 1910, while very
important additions have been made in the latter over the 
former. The Thirteenth Census of population is thus much 
the most comprehensive census of population that has ever 
been taken by the United States. 

Turning now to a consideration of plans that have been formulated,
or are in the process of formulation, for the presentation
of the data relative to population, it is manifest that it
will not be possible for me, in a paper of this character, to 
give anything approaching a detailed description of the dif- 
ferent tables that will be employed. Such a description 
would be meaningless unless the actual forms of the tables 
themselves were before you for consideration. There are, 
however, certain innovations in past practice which have either 
been definitely determined upon, or which are in contem- 
plation, that are of sufficient importance to warrant my at- 
tempting to indicate them. 

Probably the most important of these consists in the plans 
that have been adopted for the presentation of information 
for the counties individually. Briefly, this consists in bring- 
ing together, in one place, all the data that is given for a par- 
ticular county. There will be a separate table for each 
state which will contain as many columns as there are coun- 
ties in the state; and in addition a column for the state as a 
whole. The data given will be indicated by a succession 
of stubs running down the side of the table. A person inter- 
ested in a given county on account of residing in that 
county, or for any other reason, will thus be able readily to 
refer to the table for the state containing this county and get, 
without further hunting through the census volumes, all the 
facts that are given in the census reports on population regard- 
ing, not only the county in which he is interested, but all the 
other counties of the state, and for the state as a whole, in 
parallel columns, so that he can see the facts regarding such 
county and make such comparisons as he desires with the facts 
as shown for other counties of the state or the state as a whole. 
Appended to this paper is a copy of the form for this table. 


FORM OF TABLE.

ALABAMA:—STATISTICS OF POPULATION IN DETAIL, FOR THE STATE
AND BY COUNTIES, 1910.

Subject.                                      The State.  Autauga.  Baldwin.  Barbour.

POPULATION.
Total ........................................
Number in 1800 ...............................
Increase:
    Per cent. of increase ....................
Places of 2,500 or more in 1910 ..............
    Per cent. of increase: 1900-1910 .........
Remainder of county in 1910 ..................
    Same territory [...] .....................
    Per cent. of increase: 1900-1910 .........
Places of 2,500 or more in 1900 ..............
Remainder of county in 1900 ..................
Per cent. in places of 2,500 or more: [...] ..
Per cent. in places of 2,500 or more: [...] ..
Land area (square miles) .....................
Persons per square mile ......................

COLOR AND NATIVITY.
Native white—both parents native .............
Native white—one or both par. for. born ......
Native white—one parent foreign born .........
Native white—both parents foreign born .......
Foreign-born .................................
Negro ........................................
All other ....................................
Number in 1900 ...............................
Per cent. of total population.
    Native white—both parents native .........
    Native white—one or both par. for. born ..

COUNTRY OF ORIGIN.
Foreign-born white: Born in—
    All other (see Table [...]) ..............
Native white: Parents born in—
    Austria ..................................
    Canada—English ...........................
    Canada—French ............................
    Denmark ..................................
    England ..................................

SEX.

MALES OF VOTING AGE.
Native white—one or both par. for. born ......
Native white—one parent foreign born .........
Native white—both parents foreign born .......
Citizenship of foreign-born white.
    Naturalized ..............................
Illiterate males of voting age.
    Total, number illiterate .................
        Per cent. illiterate .................
        Per cent. in 1900 ....................
    Native white, number illiterate ..........
        Per cent. illiterate .................
    Foreign-born white, number illiterate ....
        Per cent. illiterate .................
    Negro, number illiterate .................
        Per cent. illiterate .................
Persons 10 years old and over.
    Per cent. illiterate .....................
Persons 10 to 20 years.
    Per cent. illiterate .....................

SCHOOL AGE AND ATTENDANCE.
Number 15 to 17 years ........................
    Number attending school ..................
    Per cent. attending school ...............
Persons 6 to 14 years.
    Number attending school ..................
    Per cent. attending school ...............
Persons 15 to 20 years.
    Number attending school ..................
    Per cent. attending school ...............
Negro
    Number attending school ..................
    Per cent. attending school ...............

Note: This list of countries given under “Country of Origin” is not
necessarily the one that will be used, as a selection will be made in
the case of the table for each state of those countries which have the
[...] representation in that state. In the case of tables for the
Southern states the data given under illiteracy will be such as to
bring out the differences between the white and negro [...] of between
native white [...] born white as shown in this form for the [...] states.


This represents a radical departure from the method of 
publishing county figures ten years ago. At the census of 
1900 the county data were presented “topically”; that is, there
was a separate table for each subject concerning which informa- 
tion for counties was given, as, for example, sex, color, nativity, 
illiteracy, and the like. It was thus necessary, in using the 
census reports for 1900, for a person who desired to obtain a 
general picture regarding conditions in a particular county, 
as brought out by the population inquiries, to refer to a large 
number of tables; and these tables did not in all cases follow 
one after the other, but were scattered more or less through 
the two volumes containing the population data. 
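
The contrast between the two arrangements can be put in modern terms as two groupings of the same records. The following is a minimal sketch in Python, not anything drawn from the census plans themselves; the county names, topics, and figures are hypothetical placeholders, and the function names `topical_layout` and `county_layout` are introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical records: (county, topic, value). Not census figures.
records = [
    ("County A", "Total population", 20_000),
    ("County A", "Per cent. illiterate", 12.5),
    ("County B", "Total population", 35_000),
    ("County B", "Per cent. illiterate", 9.0),
]


def topical_layout(records):
    """1900 style: one table per topic, each listing every county."""
    tables = defaultdict(dict)
    for county, topic, value in records:
        tables[topic][county] = value
    return dict(tables)


def county_layout(records):
    """1910 style: one column per county, holding every topic for it."""
    columns = defaultdict(dict)
    for county, topic, value in records:
        columns[county][topic] = value
    return dict(columns)


by_topic = topical_layout(records)
by_county = county_layout(records)

# Under the county layout a single lookup gathers everything for a county.
print(by_county["County A"])
```

Under the topical grouping the reader interested in one county must consult every table in turn; under the county grouping a single column suffices.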

All persons who have had this change brought to their 
attention believe that the innovation will constitute a very 
marked improvement over prior practice. Certainly it will 
mean that census statistics of population will appeal to, and 
be utilized by, a far greater number of persons than ever 
before. 

A similar set of tables, one for each state, will present the 
data that is given separately for cities, villages, etc., having 
a population of 2,500 or more. This, like the county tables 
just described, is an innovation over the 1900 census. 

Before leaving these two sets of tables it should be noted 
that they give the statistics covered by the tables, not only 
for each county individually, and for each city or town having 
as small a population as 2,500, but that they also give, as a necessary
part of their scheme, the same statistics for the state
as a whole and for each city no matter how great may be its 
population. These two sets of tables together, therefore, 
are all-comprehensive, in the sense of giving statistical data 
according to the largest as well as the smallest geographic unit 
of presentation. 

It is evident that in the case of such small geographic units 
of presentation as the county and a city having as small a 
population as 2,500, limitations of space, were there no other 
considerations, would prevent the entering into all the refine- 
ments of the statistical tabulation that it is desirable to make in 
order fully to bring out the important points concerning population
conditions. In point of fact, however, little would be
gained by entering into such detail for such small geographic
units, even did these limitations of space not prevent. In
order to draw deductions of value regarding the characteristics
of population such, for example, as sex, conjugal condition,
size of family and the like, it is necessary that the groups
to be dealt with should represent numbers of considerable
magnitude. Generally speaking, therefore, the smallest
geographic unit that will be employed, in order to present
statistics topically, will be the state and the city having
a population of 25,000 or over. In dealing with these, and the
larger geographic units, the main divisions of the country,
the country as a whole, and the rural and urban sections as
a twofold geographic division, the center of interest will shift
from the geographic unit to the topical unit. The general
scheme for these tables will thus be the reverse of that for the
two series of tables for counties and for cities having a population
of 2,500 or over, in that the topic will be the main unit
and the geographic divisions the subsidiary ones running
down the side of the page as stubs, instead of the geographic
division being the primary unit and the data or topics the
subsidiary ones. There will thus be a series of tables for each
topic, such as sex, age, nativity, conjugal condition, literacy
and the like, so constructed that the information regarding
the topic is presented for the country as a whole, the main
divisions of the country, the state and the cities having a
population of 25,000 and over, separately, in order that the
facts relative to the topic can be seen for each of these main
geographic divisions in comparison one with the other.
Notwithstanding the fact that, generally speaking, where the
information to be imparted is in regard to a certain topic or
subject such as conjugal condition, literacy, etc., such topic
should be made the main unit of presentation and the geographic
unit the subordinate one, it is, nevertheless, extremely
desirable that the information should be given in such a way
that a person interested in a particular state can find the
information in the special bulletin that will be issued for such
state, or in the corresponding chapter of the final report
devoted to such state, since the state bulletins will be bound
together as one of the volumes of the final report.


In 1900 the data given in the topical tables were arranged so 
as to give the detailed facts regarding the main classes of the 
population, as, for example, native born, foreign born, native 
whites with native parents, native whites with foreign pa- 
rents, negro, etc., in separate tables for each class, so that, 
in these tables, all of the data relative to a particular state 
did not appear in one place. This was unfortunate for two 
reasons: (1) because it did not permit of a comparative study 
of the data regarding the different main classes of the popu- 
lation within a state to be easily made and (2) because it 
was impossible to lift the part of each table that related to a 
given state and reproduce it without change in the state bul- 
letins. The only way by which the facts shown in these 
topical tables could have been also shown in the separate state 
bulletins would have been by constructing entirely new tables 
and going to the great additional expense involved in the type- 
setting necessary to put these tables into print. In the report 
for 1910 both of these objections will be met by having, in 
all, or practically all, the topical tables, all the facts relative 
to an individual state appear in immediate juxtaposition
to each other as a section of the table, so that, not only can 
these different facts, as regards an individual state, be more 
easily studied in relationship to each other, but the section 
for each state can be reproduced in the bulletin for such 
state without any additional work in the way of preparation
of new tables or new type-setting or composition work. 

This is but one illustration of the effort that is being made 
generally so to present the census statistics that persons inter- 
ested in conditions in a given county or state, as well as those 
interested in a given topic, can readily find, in one place, 
under the proper geographic designation, the data they seek,
and will not be compelled, so to speak, to compile it from 
figures appearing in different tables scattered throughout the 
report. 

There exists, of course, a wide opportunity for the exercise 
of discretion in determining precisely the data that will be 
given in the particular tables, and the tables in 1910 will 
naturally present differences from those for 1900 and preceding 
censuses. It is manifestly not feasible for me in this paper
to attempt to point out many of these differences, although 
some of them are of considerable significance. There is one 
difference, however, which, more or less, will run through 
all of the tables. This is the great extent to which there will 
be inserted in the tables, in immediate juxtaposition to the 
absolute data for 1910, figures giving the corresponding data 
for previous censuses, and showing percentage distribution, 
increase, or relation in order that the significance of the facts 
presented by the Thirteenth Census figures may be more 
fully brought out in the main tables themselves, instead of 
leaving this work of interpretation to the text analysis or to 
special calculations by persons using the census reports. 

This practice will facilitate enormously the practical use 
of census statistics. An illustration of just how great a gain 
will result may be seen by contrasting the character of data 
that was given in 1900 regarding illiteracy with what it is 
proposed to give for 1910. In 1900 the number of persons 
ten years of age and over who were illiterate was given in 
great detail according to main classes of the population, such 
as native born, foreign born, native white of native parents, 
native white of foreign parents, negro, etc. In the tables 
giving these facts, however, there was not given the total 
number of persons ten years of age and over with which to 
contrast the number who were illiterate, nor percentages 
showing the extent of illiteracy. Indeed, in many cases, the 
total number for each class was not given anywhere in the 
report or appeared in such a way that it could only be ob- 
tained by the addition of subgroups. The report for 1910 
will follow the general practice of giving the total numbers 
in each class with which to contrast the numbers illiterate 
and the percentages by means of which the contrast is brought 
out. This practice, as stated, will be employed generally 
throughout the report. 
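
The arithmetic that the 1910 tables are intended to spare the reader is simple, and may be sketched as follows. This is only an illustration in Python; the class labels and the figures are hypothetical, and the function name `per_cent_illiterate` is an assumption made for the example.

```python
def per_cent_illiterate(number_illiterate, total_in_class):
    """Illiterates as a percentage of all persons ten years and over in the class."""
    if total_in_class <= 0:
        raise ValueError("total_in_class must be positive")
    return round(100 * number_illiterate / total_in_class, 1)


# Hypothetical figures: (number illiterate, total ten years of age and over).
classes = {
    "Class A": (12_000, 400_000),
    "Class B": (4_500, 60_000),
}

percentages = {
    name: per_cent_illiterate(illiterate, total)
    for name, (illiterate, total) in classes.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct} per cent. illiterate")
```

Without the total for each class, as in the 1900 tables, the second argument is simply unavailable and no percentage can be formed; the absolute number of illiterates in Class A is the larger, yet its percentage is the smaller.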

Another consideration that will probably be more empha- 
sized than ever before will be that of segregating and group- 
ing all tables relative to a particular topic in one place in the 
report, and in having the tables and the text describing and 
commenting upon the tables appear together. Most persons 
making use of the census reports do so for the purpose of 


I 
a 
fi 
t 
ti 
te 
t] 
h 
0 
It 
a 
n 
ir 
ai 
ID 


51] Scope and Methods of the Thirteenth Census. 337 


obtaining information either concerning some one locality, 
or concerning some one topic or number of related topics, 
in which they are interested. It is believed, therefore, that 
this more careful grouping of the census data according to 
subject-matter taken in connection with the comprehensive 
character given to the county tables and state bulletins will 
facilitate greatly the use of the census volumes by the student 
and general public. Any change that will aid the public
in making use of the census volumes will be a great gain. 

It is almost certain, also, that the detailed data regarding 
population such as sex, age, literacy and the like, will be 
presented for each country of birth, in the case of the foreign 
population, to a much greater extent than in any former 
census. Heretofore most of these facts were given simply 
for the total foreign-born population as a class. 

These data showing the characteristics of the population 
will also be largely given for what may be called the second 
generation of the population coming from each foreign country; 
that is, those born in the United States but whose parents 
were immigrants born in that foreign country. Practically 
nothing was done in 1900 in this direction. For the first 
time, therefore, it will be possible to determine the charac- 
teristics of this class of the population separately for each 
country of origin and to contrast such characteristics with 
those shown for the first generation of immigrants on the one 
hand, and with the native-born of native parents, on the 
other. 

Another important departure from the methods employed 
in giving the statistics for 1900 consists in the much greater 
attention that will be paid to the presentation of figures so 
as to contrast conditions in urban and rural communities. 
This distinction was made to some extent in 1900, but by 
no means generally throughout the report. It is proposed 
in the report for 1910 not only to have this distinction appear 
much more generally than in 1900, but a more detailed analy- 
sis of urban figures will be shown by giving such figures 
in many cases according to the five subclasses of cities of a
population of from 2,500 to 10,000, 10,000 to 25,000, 25,000 
to 100,000, 100,000 to 500,000 and 500,000 and over. 
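
The five subclasses may be expressed as a simple rule of assignment. The sketch below, in Python, rests on two assumptions not stated in the text: that a city whose population falls exactly on a dividing figure (10,000, for example) is placed in the higher class, and that places under 2,500 are treated as rural. The name `size_class` is hypothetical.

```python
# Upper limits are exclusive (an assumption; the text does not say
# to which class a city of exactly 10,000 belongs).
SIZE_CLASSES = [
    (10_000, "2,500 to 10,000"),
    (25_000, "10,000 to 25,000"),
    (100_000, "25,000 to 100,000"),
    (500_000, "100,000 to 500,000"),
    (None, "500,000 and over"),
]


def size_class(population):
    """Return the subclass label for a place of the given population."""
    if population < 2_500:
        return "rural"  # assumed: places under 2,500 are not urban
    for upper, label in SIZE_CLASSES:
        if upper is None or population < upper:
            return label


for population in (1_800, 2_500, 10_000, 99_999, 750_000):
    print(population, "->", size_class(population))
```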




The only other case in which I think the attempt should 
be made by me in this paper to comment upon the specific 
character of the data that will be included in the tables has 
to do with the matter of the presentation of the population 
statistics according to the country of birth and mother tongue. 
The reasons for including upon the population schedule the 
inquiry relative to mother tongue have already been given. 
The securing of this information furnishes two bases by which 
the data concerning our foreign population, or our popula- 
tion descended from foreign-born parents, may be presented; 
that is, country of birth or mother tongue; the latter, as al- 
ready explained, representing racial stock. But for the mat- 
ter of space required, it would be desirable to present all the 
statistics according to both bases: according to country of 
birth, because having a value in itself, and in order to permit 
of comparisons with preceding censuses; and according to 
mother tongue, because it is important to determine the 
characteristics and the attributes of the foreign-born popu- 
lation according to their racial stock as well as according to 
the particular countries from which they or their parents 
may have come. Not only is it theoretically desirable to 
present the figures according to both of these bases, but it 
is also theoretically desirable to present them according to 
the two bases in combination; that is, in such a way as to 
show under each country of birth, as, for example, Austria, 
the figures regarding each racial stock represented by persons 
coming from that country and, vice versa, for each racial 
stock the countries of birth of the persons going to make up 
the numbers returned as belonging to that stock. It is com- 
paratively a simple matter to give this information according 
to the two bases in combination, in so far as the mere number 
of persons represented is concerned, since a simple table, 
giving the countries of birth at the heads of the columns and 
the mother tongues in the stubs at the side of the page, will 
present this information. A showing of this character will 
consequently be made. When we turn, however, to the mat- 
ter of giving the figures showing the characteristics and at- 
tributes of the persons enumerated it is impracticable to make 
such showing separately for both of the two bases in com- 


4 
4 
| 
| 
j 
| 
| 
| 
| 
| 
| | 
| 
4 
a 


53] Scope and Methods of the Thirteenth Census. 339 


bination. The problem that confronts the Census Bureau, 
therefore, is that of deciding whether such details will be given 
by one or the other of these two bases, or by both. No 
definite conclusion has been reached on this point, although 
it is probable that, on account of the space that would be 
required, the showing will be made by only one of the bases, 
and that, in order that comparison may be made with previous 
census data, the basis selected will be that of the country of 
birth. In doing so, however, the facts regarding racial stock 
will be brought out to a considerable extent by distinguishing, 
in the case of the three countries having the greatest mixture 
of racial stocks,—Germany, Austria and Russia,—between 
the different racial stocks coming from such countries by mak- 
ing them subheads under the totals for the countries them- 
selves. Furthermore, it is of interest to note that there will 
also be available in the Census Bureau the data necessary 
to make a complete showing according to the basis of mother 
tongue or racial stock if at any time hereafter it is deemed 
that the value of such a showing is sufficient to warrant the 
expense involved. 
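
The simple table of numbers by the two bases in combination, with countries of birth at the heads of the columns and mother tongues in the stubs, amounts to a cross-count of the individual returns. A minimal sketch in Python follows; the returns listed are hypothetical and chosen only to show the form, and the name `cross_table` is an assumption of the example.

```python
from collections import Counter

# Hypothetical returns: (country of birth, mother tongue), one per person.
returns = [
    ("Austria", "German"),
    ("Austria", "Polish"),
    ("Austria", "Bohemian"),
    ("Germany", "German"),
    ("Germany", "Polish"),
    ("Russia", "Polish"),
    ("Russia", "Yiddish"),
    ("Austria", "German"),
]

counts = Counter(returns)
countries = sorted({country for country, _ in returns})
tongues = sorted({tongue for _, tongue in returns})


def cross_table(counts, countries, tongues):
    """Rows are mother tongues (the stubs); columns are countries of birth."""
    return {t: [counts.get((c, t), 0) for c in countries] for t in tongues}


table = cross_table(counts, countries, tongues)

print("Mother tongue".ljust(15) + "".join(c.rjust(10) for c in countries))
for tongue, row in table.items():
    print(tongue.ljust(15) + "".join(str(n).rjust(10) for n in row))
```

A table of this kind requires only one cell for each combination. To give sex, age, literacy and the like for every cell would multiply the space required by the number of such characteristics, which is the difficulty described above.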

Before leaving this matter of population mention should be 
made of the fact that the bureau contemplates collecting together
in one place, and publishing as a separate bulletin,
which will subsequently constitute a chapter of the final 
report, all of the figures that are given throughout the entire 
report for the United States as a whole. This will mean 
simply the reproduction in one place of the figures consti- 
tuting the totals for the United States as a whole in all the 
series of tables embraced within the report. The bringing 
of these figures together, and segregating them from the figures 
relative to smaller geographic units of presentation will, it 
is believed, be of great value and interest to the large class 
of persons in this country and in foreign lands, who are in- 
terested in knowing the grand results of the enumeration of 
the population in 1910 in comparison with the results shown 
at previous censuses. The having of these figures in a com- 
paratively small bulletin will also tend to economy in the 
distribution of the report, since, in many cases, this bulletin 
can be sent in answer to inquiries for information, instead of
having to send the bulky and more expensive general reports. 
Somewhat similar bulletins will be issued for the individual 
states as rapidly as the data is compiled in the same way 
as was done in 1900. 

This completes what I have to say regarding the topic 
assigned to me. It will, I am sure, be appreciated that all 
that I have been able to do has been to select for comment a 
few of the many points presented in determining upon the 
manner and form in which the census figures should be pub- 
lished. In making this selection, however, I have sought 
to choose those which best illustrate the fundamental aims
that the bureau has kept constantly before it, namely, that of
so constructing and so arranging the tables as regards the 
order of their presentation, that the effective use of the 
information given will be facilitated to the maximum extent 
possible and the reports as a whole appeal to the largest 
possible body of inquirers. 




*A STATISTICAL SURVEY OF INFANT MORTALITY’S
URGENT CALL FOR ACTION. 


By Edward Bunnell Phelps.


At the very outset, it should be clearly understood that 
all authorities on the subject long since concurred in re- 
stricting the application of the term Infant Mortality to 
deaths under one year of age, thus indirectly relegating to 
the class of Child Mortality all deaths of children between 
one and, say, five years. Consequently, all figures and 
statements in this paper dealing with Infant Mortality apply 
only to deaths under one year of age. The world’s spe- 
cialists on vital statistics have also tacitly agreed, for good 
and sufficient reasons which need not here be discussed, 
that the rate of Infant Mortality shall be calculated by the 
division of the number of deaths under one year by the num- 
ber of births per annum—stillbirths excluded—instead of 
by the division of the number of deaths under one year by 
the living population under one year, as would be the case 
were the time-honored method of computing the death- 
rates at all other ages applied to Infant Mortality. The In- 
fant Mortality ratio being worked out on a basis positively 
unique, it is, therefore, obvious that it cannot properly be 
compared, or contrasted, with the commonly-accepted death- 
rates for any other ages. 
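
The difference between the two bases of calculation may be seen in a short sketch. It is written in Python purely for illustration; the figures are hypothetical, and the function names `infant_mortality_rate` and `age_specific_death_rate` are assumptions of the example rather than established terms.

```python
def infant_mortality_rate(deaths_under_one, live_births, per=1000):
    """Deaths under one year per `per` living births (stillbirths excluded)."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return per * deaths_under_one / live_births


def age_specific_death_rate(deaths, living_population, per=1000):
    """Ordinary death-rate: deaths at an age per `per` persons living at that age."""
    if living_population <= 0:
        raise ValueError("living_population must be positive")
    return per * deaths / living_population


# Hypothetical figures, chosen only to illustrate the arithmetic.
births = 10_000           # living births registered in the year
infant_deaths = 1_500     # deaths under one year in the same year
living_under_one = 9_000  # assumed living population under one year

imr = infant_mortality_rate(infant_deaths, births)
asdr = age_specific_death_rate(infant_deaths, living_under_one)
print(round(imr, 1), round(asdr, 1))
```

The same number of deaths yields two different rates because the divisors differ, which is why the Infant Mortality ratio cannot be set beside the death-rates computed for other ages.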

That the problem is world-wide in scope is conclusively 
proven by a tabulation of the Infant Mortality of thirty-one 
of the principal foreign countries for the quarter-century 
ending with 1905, compiled and published about two years 
ago (see Table I). Summarized in a single sentence, this 
table shows that on a broad average for twenty of the prin- 
cipal countries of Europe no less than 162 out of every 1,000 
babies born alive died before completing the first twelve 
months, in the twenty-five years ending with 1905, and that
in that same period the average ratio of deaths under one
year to living births in thirty-one of the leading countries
of the world, even including the seven divisions of Australasia
with their exceptionally low rates of infant deaths,
was 154. For only about one half of these countries are
the Infant Mortality rates for the three years, 1906-08, now
obtainable, and the sixteen countries in question had an
average infant death-rate of 133 in 1906-08, as contrasted
with one of 142 in 1901-05, and one of 150 for the twenty-five-year
period ending with 1905.

* Read before the American Association for Study and Prevention of Infant Mortality
at the annual meeting in Baltimore, November, 1910.

Even were these records to be taken as absolutely worthy 
of credence, the apparent decrease in the three years ending 
with 1908 would indicate a decrease of less than two infant 
deaths per hundred living births in recent years as compared 
with the average for the quarter-century immediately pre- 
ceding. But the figures in question can scarcely be taken 
at their full face value, although cited from the official vital 
statistics of the several countries, for the reason that in prac- 
tically all countries there has been a greater improvement 
of late years in the registration of births than in that of in- 
fant deaths. And, as the infant death-rate is calculated by 
dividing the number of deaths under one by the number of 
registered births, the larger the registered percentage of 
births the larger will be the divisor, and the smaller will be 
the quotient—or apparent Infant Mortality rate—even 
though the actual numbers of births and infant deaths be 
identical for the two periods under comparison. It is evi- 
dent, therefore, that apparent declines in the annual infant 
death-rates of the various countries, states and cities cannot 
be taken as conclusive, and should not so be taken unless 
carefully investigated and supported by corroborative data. 
This defect in the commonly-accepted statistics of Infant 
Mortality is especially prevalent in those for the registra- 
tion states and cities of this country, as I shall later on en- 
deavor to make clear. 
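
The effect of improved birth registration upon the apparent rate can be shown by a small calculation. The following Python sketch assumes hypothetical figures throughout: the true numbers of births and infant deaths are held identical in both periods, and only the proportion of births registered is changed. The name `apparent_rate` and the completeness fractions are assumptions made for the example.

```python
def apparent_rate(true_births, true_infant_deaths,
                  birth_completeness, death_completeness, per=1000):
    """Rate computed from registered events only, per `per` registered births."""
    registered_births = true_births * birth_completeness
    registered_deaths = true_infant_deaths * death_completeness
    return per * registered_deaths / registered_births


# Hypothetical: the same true experience in both periods (true rate 150).
true_births = 10_000
true_deaths = 1_500

# Earlier period: 80 per cent. of births and 95 per cent. of deaths registered.
earlier = apparent_rate(true_births, true_deaths, 0.80, 0.95)
# Later period: birth registration improves to 90 per cent.; deaths unchanged.
later = apparent_rate(true_births, true_deaths, 0.90, 0.95)

print(round(earlier), round(later))
```

Under these assumed figures the apparent rate falls from about 178 to about 158 per 1,000, although the true rate is 150 in both periods; the whole of the apparent decline arises from the larger divisor.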

For the time, it seems to be safe to assume that in the 
civilized world at large, outside of the United States, not less 
than thirteen out of every 100 babies born alive die within the 
first year. In some countries the infant death-rate is nearly, 


i 
4 
¥ 


57] Infant Mortality. 343 


if not quite, twice as high as that figure, but, dealing only 
with the broadest averages, the world’s Infant Mortality 
now unquestionably amounts to thirteen deaths for every 
100 living births. As the appended tabulations for this 
country make clear, the Infant Mortality rate for the United 
States is certainly no better than that for the rest of the 
world at large. 

What is, or has been, the Infant Mortality of the United 
States as a whole? Nobody knows; and there is no means 
of finding out. The eighteen states and fifty-four cities in 
other states whose registration systems in 1909 were acceptable 
to the Bureau of the Census, include but 55.3 per cent. of 
the total estimated population of Continental United States 
for which returns for last year are presented in the Census 
Office’s recent advance bulletin of Mortality Statistics for 
1909, and but a single one of all the Southern states, namely, 
Maryland, figures in those statistics. Not only are mortality 
statistics for nearly one half of the population of Continental 
United States therefore unavailable, but as our national 
statistics for the registration area as yet present no birth 
returns—except for Census-taking years—it is a physical 
impossibility to compute the ratio of deaths under one year 
to living births, on the one universally-recognized basis, 
even for our registration area. To be sure, the majority of 
the registration states do issue annual compilations of their 
respective vital statistics, and most, if not all, of these com- 
pilations now include tabulations of living births and deaths 
at the various ages. Even the oldest and most reliable of 
these state systems of registration of births and deaths, that 
of Massachusetts, has been unable to round out complete 
annual returns of births, however, and in the Sixty-Seventh 
Report of Births, Marriages and Deaths in Massachusetts, 
for the year 1908, a frank admission and partial explanation 
of this fact are made in these words (p. 142): 


“Although the law applies to the registration of births,
as well as to that of marriages and deaths, it is probable that
the statistics of the births are less accurate than those of
either of the other two classes. From the nature of things,
marriages and deaths must be registered, in order that the
former may be solemnized, or that interment be possible in
case of deaths; but in the case of the births, the inadequacy
of penalty for neglect, ignorance of the law, as well as topo- 
graphical conditions, tend to an incomplete registration. It 
is therefore likely that the number of births returned in 
Massachusetts in 1908 was less than the actual number which 
occurred; hence, a lower birth-rate, and comparisons between 
births and deaths inaccurate.”


In many, if not most, of the other states which purport 
to present annual birth statistics, the registration of births 
is far more defective than in Massachusetts, and as an in- 
evitable result of the incomplete returns for births the Infant 
Mortality rates—or ratios of deaths under age one to living 
births—presented in the annual reports of these states can only 
be taken in a Pickwickian sense, so to speak. The divisor 
being too small in every case—in some cases materially 
under the proper figure—of course the resultant, and ap- 
parent Infant Mortality rate, is above the actual rate. As 
the years roll by, the birth registration is doubtless improving 
in most cases, the margin of error is, therefore, continuously 
changing, and hence attempted comparisons of the apparent 
Infant Mortality rates of recent years with those of earlier 
years are more or less misleading. 

Although no birth returns have been included in the Cen- 
sus Office’s annual publications of Mortality Statistics for the 
last ten years, in all of these reports there has been a classifica- 
tion of deaths by ages in the constantly-changing registration 
area, and some idea of the movement of the Infant Mortality 
rate in so far as registration states and cities have been con- 
cerned may be had by a comparison of the annual ratios 
of deaths under one year with the total number of deaths at
all ages in each of those years. Some time ago I worked out
such a comparison for the nine years ending with 1908, and
now note in the advance bulletin of Mortality Statistics for 
1909 that Doctor Wilbur suggests such a basis of comparison, 
and furnishes several tabulations on those lines which are 
of real value in any study of the Infant Mortality of this 
country. As he puts it (p. 11, Bulletin 108): 


“When the proper statement of infant mortality is lacking, 
recourse may be had to the ratio between the number of deaths 
of infants under one year of age and the population under
one year, although this ratio is unsatisfactory for many
reasons, and the population under one year is not available
except by estimation for intercensal years. A very crude
means of judging of the condition as regards the general ex-
tent of infant and child mortality is to compare the total
number of deaths of infants under one year and of children
under five years of age with the total number of deaths reg-
istered. Other things being equal—that is to say, with
substantially similar populations with respect to age dis-
tribution and in the absence of epidemic diseases prevailing
at higher age periods—the relative proportions of deaths of 
infants and children to the total number of deaths should 
show approximately the prevalence of infantile diseases and 
the importance of reducing the general mortality by efforts 
directed toward the prevention of infant mortality.” 


This means, in other words, that, in default of national 
figures for either births or living population under age one 
in the registration area, at least one available means of at- 
tempting to measure the Infant Mortality for that area is 
a comparison of the annual ratios of deaths under age one 
with the total number of deaths at all ages in that area. 
Such a comparison for the ten years 1900-1909 is presented 
in Table II. The table shows that in the quinquennial 
period, 1900-1904, the deaths under 1 year in the registration 
area amounted to 19.2 per cent. of the total deaths at all ages 
in that area; while in the quinquennial period, 1905-1909, 
the ratio had risen to 19.5 per cent., thus showing an ap-
parent increase rather than decrease. But, as Doctor Wilbur
says, this method of comparison is “a very crude means of
judging of the condition,” and its credibility depends upon
the assumption that age distribution and population condi- 
tions were substantially similar during the periods of com- 
parison. But practically no other means of even attempting 
to measure the rise or fall of the Infant Mortality rate for 
the registration area as a whole is possible, and there are 
some reasons for believing that, in the main, the ratio of 
infant deaths to deaths at all ages affords a fairly reliable in- 
dex of the Infant Mortality situation under normal general 
conditions. In the case of the comparison of the ratios of 
1900-1904 and 1905-1909, the differing conditions must be 
noted. 
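
The two quinquennial ratios quoted above can be recomputed from the period totals of Table II. The sketch below is only an arithmetical check in Python; the variable and function names are assumptions of the sketch.

```python
# Deaths in the registration area, period totals as given in Table II.
periods = {
    "1900-1904": {"under_1": 507_476, "all_ages": 2_642_555},
    "1905-1909": {"under_1": 646_257, "all_ages": 3_314_784},
}


def share_of_total(under_1, all_ages):
    """Deaths under one year as a percentage of deaths at all ages."""
    return round(100 * under_1 / all_ages, 1)


ratios = {period: share_of_total(**counts) for period, counts in periods.items()}

for period, ratio in ratios.items():
    print(f"{period}: {ratio} per cent. of all deaths were under 1 year")
```

The check reproduces the 19.2 and 19.5 per cent. of the text; it says nothing, of course, as to whether the two periods are fairly comparable.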


From 1900 to 1905, inclusive, the registration area re- 
mained practically unchanged, no additions of area being 
made, whereas in 1906 the states of California, Colorado, 
Maryland, Pennsylvania and South Dakota were added to 
the registration area, thus increasing the population of that 
area by more than 7,000,000, or more than 22 per cent. The
addition of these five states materially increased the urban
population of the registration area, and as the infant death- 
rate in the cities is, in general, considerably larger than that 
of the rural districts this radical change in the make-up of 
the registration area might confidently be expected to send 
up the infant death-rate of the area in question. But from 
1900 to 1905, inclusive, the mortality statistics of the registra- 
tion area dealt with precisely the same territory, hence are 
fairly comparable, and the registration returns for infant 
and child mortality for that period, as presented in Table II, 
are worthy of careful study. In the last of the six years in 
question, 1905, the deaths under one year were fewer by more 
than 6,000 than those in the first year of the period, 1900, 
the deaths between one and five years showed an even larger 
decrease, one of more than 10,500, and, of course, the total 
of deaths under age five had dropped to the extent of more 
than 16,500, the sum of the decreases in the previously-named 
age-groups. Assuming that the birth-rate for the six years 
was substantially uniform (the birth-rate in the census 
year 1900 was 27.2 per 1,000 of mean population in the 
United States), the natural growth of the population in the 
registration area between 1900 and 1905 (amounting to nearly 
3,000,000) would indicate an increase of about 81,000 in the 
probable number of births in 1905 as compared with 1900, 
and at the ratio of deaths under age one to living births in 
the registration area in the census year 1900 (149.4 per 
1,000), this increase in births would have involved an increase 
of more than 12,000 in the number of infant deaths in 1905 
as contrasted with the number in the calendar year 1900. 
As a matter of fact, there was a decrease of more than 6,000 
in the number of infant deaths, instead of the presumable 
increase of more than 12,000, and that figure would seem to 
signify an apparent decline of more than 23 per cent. in the 
infant death-rate of 1905 for the registration area as compared 


os 
i 
4 


61) Infant Mortality. 347 


with that for the census year 1900 as shown by the Twelfth 
Census. Of course comparisons of single year’s mortality are 
open to many serious objections, but as the number of infant 
deaths in the registration area in 1905 was higher by several 
thousands than that of any of the years intervening between 
1900 and 1905, the decrease in the actual number of deaths 
under age one, in the face of a steadily-increasing population 
and corresponding increase in the number of births, would 
seem conclusively to indicate at least a slight decrease in the 
infant death-rate. The unquestionable decrease in the 
general death-rate, from 1,755.0 per 100,000 in 1900 to 1,501.8 
in 1909 in the registration area, the similar decrease in the 
general death-rates of foreign countries, the comparatively 
slight but almost invariable decline in the infant death-rate 
in recent years in those states and foreign countries having 
reasonably complete registration systems, and all collateral 
evidence combine to suggest a small decrease in Infant Mor- 
tality throughout the United States in the last decade. But 
positive evidence of that presumable decrease will not be 
forthcoming until the Infant Mortality statistics of the 
Thirteenth Census are available. In any event, it is ex- 
tremely improbable that the infant death-rate for the registra- 
tion area, which was 149.4 per 1,000 living births in the 
census year 1900, will prove to have dropped below 130 per 
1,000 in the census year 1910. 
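
The presumable increase in births and in infant deaths follows from the three figures cited in the paragraph above. A short Python sketch of that arithmetic is given here, on the assumption that the growth of population is taken at the round figure of 3,000,000; the variable names are illustrative only.

```python
# Figures quoted in the text; the growth is rounded up from "nearly 3,000,000".
population_growth = 3_000_000          # registration area, 1900 to 1905
birth_rate_per_1000 = 27.2             # census year 1900
infant_deaths_per_1000_births = 149.4  # registration area, census year 1900

# Additional births implied by the growth of population.
extra_births = population_growth * birth_rate_per_1000 / 1000

# Additional infant deaths implied if the rate of 1900 had persisted.
extra_infant_deaths = extra_births * infant_deaths_per_1000_births / 1000

print(f"additional births: {extra_births:,.0f}")
print(f"additional infant deaths: {extra_infant_deaths:,.0f}")
```

On the round figure the sketch gives 81,600 additional births and about 12,190 additional infant deaths, in keeping with the "about 81,000" and "more than 12,000" of the text, which rest on a growth of somewhat less than 3,000,000.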

Much more convincing evidence of the hoped-for decrease 
in the annual waste of infant life in this country is afforded 
by the ten-year study of the registered living births, deaths
under age one, and deaths at all ages, and their respective 
ratios, in certain states having well-established registration 
systems, which I present in Table III. Through the coöpera-
tion of the registration officials of Massachusetts, Connecti-
cut and New York I have been enabled to obtain the figures
for 1909 in advance of the publication of their several annual 
reports, and in the case of Massachusetts and Connecticut 
have thus been able to tabulate comparisons for the last ten 
years; in the case of New York state, the comparison was 
restricted to six years, as the New York State Department of 
Health did not separately classify deaths under one year 
prior to 1904.


The general significance of this table may be summarized 
in the statement that in both Connecticut and Massachu- 
setts the number of deaths under one year was smaller in 
1909 than in 1900, despite the decided increase in population 
and living births in each case, and the ratio of infant deaths 
to living births, of course, shows a marked decrease. In 
Connecticut there has been an apparent decrease of no less 
than forty infant deaths per 1,000 living births in the last 
ten years, and in Massachusetts there has been an apparent 
decrease of 29.5 in the same period. In the last six years 
the nominal ratio of infant deaths to living births in New 
York state has decreased 21.4, and in the face of the con- 
tinuous increase in the population of the Empire State the 
number of infant deaths recorded for the entire state has 
increased only about 1,100 in 1909 as compared with 1904. 
Lest these large apparent declines in the infant death-rates 
of these three representative states may be taken more lit- 
erally than the facts warrant, I must again call attention to 
the unquestioned increase of late years in the percentage of 
registered births, thanks to the vigorous efforts of the Divis- 
ion of Vital Statistics of the Bureau of the Census, and various 
other helpful agencies, and remind you that the actual 
decrease in Infant Mortality is therefore considerably smaller 
than the figures would seem to indicate if taken at their 
face value. 

For instance, that excellent authority of almost life-long 
experience with vital statistics, Dr. William H. Guilfoy, 
registrar of records of the Department of Health of the City
of New York, tells me that although from 92 to 95 per cent. 
of the actual births in the City of New York were probably 
registered in 1909, some authorities estimate that not more 
than 66 per cent. of the births in that same city were registered 
as late as 1900. Doctor Guilfoy believes that the percentage 
of registered births in 1900 was something like 80 per cent., 
but, even on that basis, there would have been a net increase 
of at least 15 per cent. in registered births in the City of New 
York in the last ten years, and in many other cities and dis- 
tricts the increase in the percentage of registered births has 
doubtless been very much larger. Consequently, the actual 
decline in Infant Mortality in the registration states of this
country unquestionably is much smaller than the apparent
decline, and in at least some foreign countries the same ex- 
ception must be noted and taken into account in comparing 
their official Infant Mortality figures.
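
How far improved registration alone can depress the apparent rate may be judged from Doctor Guilfoy's percentages. The following Python sketch is an illustration under stated assumptions: it takes 80 per cent. for 1900 and the lower estimate of 92 per cent. for 1909, and it supposes, hypothetically, that the true infant death-rate did not change at all.

```python
# Estimated share of actual births that were registered in New York City.
share_1900 = 0.80  # Doctor Guilfoy's estimate for 1900
share_1909 = 0.92  # lower bound of the 92 to 95 per cent. estimated for 1909

# Relative growth in registered births due to better registration alone.
gain_in_registered_births = share_1909 / share_1900 - 1

# Hypothetical case: the true rate is unchanged, yet the apparent rate
# (deaths divided by registered births) declines by this fraction.
apparent_decline = 1 - share_1900 / share_1909

print(f"gain in registered births: {gain_in_registered_births:.0%}")
print(f"apparent decline with no real change: {apparent_decline:.1%}")
```

On these assumptions registered births rise 15 per cent. from better registration alone, and an unchanged true rate would show an apparent decline of about 13 per cent.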

By far the largest percentage of deaths under age one due 
to any one class of causes is that of infant deaths caused by 
diseases of the digestive system, which in 1909, for illustra- 
tion, amounted to 29.5 per cent. That is to say, almost one 
third of all the deaths under age one in the registration area 
of the United States in the last year were due to that one 
class of disease. Deaths due to diseases of early infancy 
ranked second in numerical importance, footing up 23.9 per 
cent., or nearly one quarter of the dread total, and diseases 
of the respiratory system accounted for 16.5 per cent. of the 
infant deaths in the registration area. All told, these three 
classes of causes carried off 69.9 per cent. of all the babies 
under one year of age who died in that area in 1909. 

As that eminent authority on the diseases of children,
Dr. L. Emmett Holt, put it in his address on “Infant Mortal-
ity and Its Reduction, Especially in New York City,” before
the Section on Diseases of Children of the American Medical
Association, in June, 1909: “The fundamental causes of
Infant Mortality, as we may call them, are mainly the re-
sult of three conditions—poverty, ignorance and neglect.

The curve of diarrheal diseases is so important that it prac- 
tically controls the curve of Infant Mortality. This group 
embraces acute gastritis, gastro-enteritis, all forms of acute 
diarrhea, dysentery and cholera infantum and makes up the
largest part of the immense summer mortality. It is these 
diseases which cause regularly each year the sharp rise in 
the death curve in July and August.” In citing these quota- 
tions from Doctor Holt’s paper, I have deliberately brought 
together his authoritative statement of the fundamental 
causes of Infant Mortality—poverty, ignorance and neg-
lect—and his comment on the commanding importance of
diarrheal diseases—especially in connection with the immense 
summer mortality—although they were not associated in 
his address. For, it seems to me, if a layman may venture 
to express an opinion on a phase of the Infant Mortality 
problem which the specialists in pediatrics are so much more
competent to discuss, that the fundamental causes named by
Doctor Holt—poverty, ignorance and neglect—in the natural 
course of things are much more potent factors in the summer 
months in making the Infant Mortality rate what it is than 
in any other season of the year. 

In his paper on “Infantile Mortality and Its Principal 
Cause—Dirty Milk,” the late Dr. Charles Harrington,
secretary of the State Board of Health of Massachusetts, 
remarked apropos of the seasonal distribution of Infant 
Mortality: “From the facts and figures thus shown it might 
be inferred that all infants under one year of age are in great 
danger during the hot summer months, but this is far from 
being the case. Not the three summer months, but the 
first three months of life, are the dangerous period.” Un-
questionably true though this statement be, so compara-
tively few accurate records of infants’ deaths by ages ex-
pressed in months are available and there is such a wealth
of information as to the seasonal distribution of Infant Mor- 
tality, that public attention has much more graphically 
been drawn to the abnormal dangers of the summer months, 
and in my judgment the unusual Infant Mortality of that 
season of the year offers the foremost strategic point of effec- 
tive attack for movements like that of this Association. Of 
course the fundamental causes of which Doctor Holt spoke 
are operative the year around, but it is in the summer months 
that those luxuries for the poor, abundant ice and pure milk, 
play the most important part in determining whether the 
babies of the poor shall live or die. Then it is that the un- 
bearable heat of the tenements drives their unfortunate oc- 
cupants of all ages into the streets, to the fire-escapes, and to 
the roofs, and then it is, as I see it, that poverty, ignorance 
and neglect, developed as it were by the summer heat, are 
most apt to exercise their baneful influences in raising the 
Infant Mortality rate to its top notch. 

With the world at large it is the array of large figures, and 
not mere percentages, which makes the deepest impression. 
Consequently, in endeavoring to bring out the importance 
of the summer infant death-rate I have first tabulated a 
comparative statement, by weeks, of the births and infant 
deaths in the greatest city on this continent—and its various 
boroughs—in the third quarter of 1910 and 1909, and have 


x 
i 
be 
| 


65] Infant Mortality. 351 


then emphasized the disheartening regularity of that sharp 
rise in Infant Mortality by a somewhat similar, though less 
detailed, presentation of corresponding figures for an entire 
state, Connecticut, for which monthly comparative figures 
for both years were available. The figures in Tables IV, V 
and VI, not only show how conspicuously the Infant Mor- 
tality rate mounts up in that season of the year, but how in- 
flexible the increase seems to be, despite all the efforts now 
being made to grapple with it. The contrast between the 
summer rate and the annual rate of Infant Mortality is sharply 
brought out by the fact that in the third quarter of 1909 the 
ratio of deaths under age one to registered living births in 
the City of New York was 169 per 1,000, as against a ratio 
of only 130 for the entire year 1909—that is to say, was 
larger by an even 30 per cent.—and that the infant deaths 
in that quarter amounted to 32.15 per cent. of the total for 
the entire year. In the state of Connecticut, the infant 
death rate in the third quarter of 1909 was 192, as compared 
with one of 131 for the entire year 1909, an excess of nearly 
47 per cent. 
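
The percentages of excess quoted above are simple ratios of the quarterly to the annual rate. A brief Python sketch of the arithmetic follows; the function name `excess_per_cent` is an assumption of the sketch.

```python
def excess_per_cent(quarter_rate, annual_rate):
    """Excess of the third-quarter rate over the annual rate, in per cent."""
    return 100 * (quarter_rate / annual_rate - 1)


# Deaths under age one per 1,000 registered living births, 1909.
new_york_city = excess_per_cent(quarter_rate=169, annual_rate=130)
connecticut = excess_per_cent(quarter_rate=192, annual_rate=131)

print(f"City of New York: {new_york_city:.1f} per cent. above the annual rate")
print(f"Connecticut: {connecticut:.1f} per cent. above the annual rate")
```

The sketch returns 30.0 per cent. for the City of New York and 46.6 per cent. for Connecticut, the "even 30" and "nearly 47" of the text.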

But those statistical facts are mere mathematical demon- 
strations of a well-known truth; a much more important 
showing of the tables in question is the fact that, despite
all the wide fluctuations for individual weeks and individual 
boroughs in 1909 and 1910, the infant death-rate of the City 
of New York in the third quarter of 1910 was precisely identi- 
cal with that for the corresponding quarter of 1909, namely, 
169 per 1,000 registered living births. And the seeming in- 
flexibility of summer Infant Mortality is strongly confirmed 
by the fact that in the entire state of Connecticut, of course 
including the rural districts along with the cities, the infant 
death-rate in the third quarter of 1910 was practically identi- 
cal with that for the corresponding quarter of 1909, being 193 
per 1,000 registered living births this year as compared with 
192 per 1,000 registered living births in 1909. To be sure, 
comparisons for only two years are by no means convincing, 
but it is at least notable that in the case of both one of the 
world’s greatest cities and an adjoining state with a widely-
scattered population scarcely one-fifth as large as that of
the metropolis, in the third quarter of 1909 and 1910 the 
respective infant death-rates for both years should be prac-
tically identical. If allowance were to be made for the prob-
able slight improvement in the registration of births in both
instances in 1910, it would mean that the actual infant death- 
rates for the third quarter in both cases were slightly larger 
in 1910 than in 1909. But, in any event, the figures seem to 
show that, in so far as merely two-year records can tell the 
story, the Infant Mortality in both cases was quite as high 
this year as last year, to say the least. 
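
The rates compared in the last two paragraphs can be recomputed from the totals of births and deaths under one year given in Tables V and VI. The Python sketch below is merely an arithmetical check; the names are illustrative, and the New York total of deaths for 1909 is here taken as the sum of the five borough figures.

```python
def rate_per_1000(deaths_under_one, living_births):
    """Deaths under one year per 1,000 registered living births, rounded."""
    return round(1000 * deaths_under_one / living_births)


# Third-quarter totals: (deaths under 1 year, registered living births).
third_quarter = {
    ("New York City", 1910): (5_517, 32_734),
    ("New York City", 1909): (sum([2_632, 321, 1_725, 321, 138]), 30_363),
    ("Connecticut", 1910): (1_334, 6_926),
    ("Connecticut", 1909): (1_263, 6_585),
}

rates = {key: rate_per_1000(*counts) for key, counts in third_quarter.items()}

for (place, year), rate in rates.items():
    print(f"{place}, third quarter of {year}: {rate} per 1,000")
```

The recomputed rates are 169 and 169 for the city and 193 and 192 for Connecticut, agreeing with the figures in the text.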

The final tables, VII and VIII, contain a succinct state- 
ment of the number and percentage of infant deaths in the 
registration area of the United States in the last ten years 
due to the four principal causes of Infant Mortality, aside 
from diseases of the respiratory system, namely, diarrhea 
and enteritis, those diseases of early infancy, premature 
birth and congenital debility, and that cause of death, perhaps 
more or less closely allied with the causes which contribute 
to premature birth and congenital debility, to wit, mal- 
formations. In the decade 1900-1909, these four classes of 
causes of death were responsible for no less than 52.05 per 
cent. of all the deaths under age one recorded in the registra- 
tion area of the United States, and in the latter half of that 
period the percentage of deaths due to them was considerably 
higher than in the former half, namely, 54.19 as compared 
with 49.33. In every case except that of congenital debility, 
the percentage of deaths from each of these four causes was 
higher in 1905-1909 than in 1900-1904. Probably the in- 
clusion of five additional states with many heavily-populated 
cities in the registration area in the latter half of the period 
has had something to do with increasing the percentage of 
deaths due to these causes in 1905-1909 as compared with 
1900-1904, but my recollection is that a similar tabulation 
of deaths from these causes in certain large manufacturing 
cities for a long series of years which I prepared some years 
ago showed an almost unbroken increase from year to year 
in each of these cities. As to this phase of the subject, the 
statistician’s work properly ends when he has tabulated, and 
presented in proper form, the actual figures; the discussion 
of the reasons for the fact, and the significance of it, does 
not come within his province, and that branch of the subject 
of Infant Mortality belongs to, and should be left with, 
the medical profession. 
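
The percentages for the decade and its two halves follow directly from the totals in Tables VII and VIII. A minimal Python sketch of the division is given below; the dictionary names are assumptions made for the illustration.

```python
# Deaths under 1 year in the registration area, from Tables VII and VIII.
deaths_under_one = {
    "1900-1909": 1_153_733,
    "1900-1904": 507_476,
    "1905-1909": 646_257,
}
deaths_from_four_causes = {
    "1900-1909": 600_554,
    "1900-1904": 250_337,
    "1905-1909": 350_217,
}

# Share of all deaths under 1 year that is due to the four principal causes.
shares = {
    period: round(100 * deaths_from_four_causes[period] / deaths_under_one[period], 2)
    for period in deaths_under_one
}

for period, share in shares.items():
    print(f"{period}: {share} per cent.")
```

The two halves also sum to the decade totals, which is a useful check on the tabulation itself.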


TABLE I. 

THE INFANT MORTALITY RATES OF 31 PRINCIPAL COUNTRIES, BY FIVE- 
YEAR PERIODS, 1881-1905, AND THE ANNUAL INFANT MORTALITY RATES 
OF 16 OF THOSE COUNTRIES, 1906-1908. 


Deaths under Age 1 per 1,000 Living Births. 


Countries. | 1881-1885. | 1886-1890. | 1891-1895. | 1896-1900. | 1901-1905. | 1881-1905. | 1906. | 1907. | 1908. | 1906-1908.


‘93 | “92 | 97 


ungary 
Russia in Europe.. 


Averages for Europe... . _ 163 | 162 | 169 | 162 | 153 | 162 | 146 | 141 | 145 | 144 
New Zealand........... 90 | 84 87 | 62 89 
OS | 109 | 103 O4 98 90 99 91 82 
South Australia........ | 101 | 105 99 | 112 | 87) 101*| 76 66 
| 136 | 119 | 103 | 104 95 | 111 75 
73 
98 


RECAPITULATION.


Europe            | 163 | 162 | 169 | 162 | 153 | 162 | 146 | 141 | 145 | 144
Australasia       | 117 | 111 | 105 | 111 |  95 | 108 |  83 |  82 |  76 |  80
Other Lands       | 184 | 177 | 206 | 207 | 208 | 196 | 241 | 235 | 226 | 234
†Grand Averages   | 155 | 152 | 159 | 157 | 147 | 154 | 136 | 133 | 130 | 133


* Returns for one or more years wanting, and averages have been calculated on basis of
returns for other years of period in question.

† Computed by division of totals for all countries represented in table by number of
countries in question.

Bold-faced figures represent estimates for periods for which no returns were available,
[...] in each case being average of actual returns for balance of entire twenty-five-year
period.

The above table has been compiled in part from Table III in Phelps’ “A Statistical
Study of Infant Mortality,” in the Quarterly Publications of the American Statistical
Association, New Series, No. 83 (Vol. XI), September, 1908, and data for years [...]
Seventy-First Report of the Registrar-General for Eng[...] lxvii.


94 | 95] 102 | 106 98 | 99 94 
116 | 105 | 103 | 101 | 92% 104% ... |... |... |... 
Bulgaria..............-| 81 | 140 | 143 | 120% 4]... 
117 | 121 | 126 | 129 120 | 
Finland...............| 162 | 144 | 145 | 139 
England and Wales.....| 139 | 145 | 151 | 156 | 138 | 146 | 132 | 118 | 120 | 123 
Switzerland............| 171 | 159 | 155 | 143 | 134 | 153]... |... ]...]... 
Belgium...............| 156 | 163 | 164 | 158 | 148 | 158]... ...]...]... 
Servia............++++-| 157 | 158 | 172 | 159 | 149 | 159 | 144 | 147 | 158 | 150 
167 | 166 | 171 | 159 | 139 | 160]... |... |... |... 
The Netherlands.......| 181 | 175 | 165 | 151 | 136 | 162 | 127 | 112 | 125 | 121 ; 
275 | 175 | 185 | 168 | 168 | 175% ... | ... |... | 
193 | 186%] 185 | 185 | 173 | 185% ... |... |... 
Prussia................| 207 | 208 | 205 | 201 | 190 | 202 | 177 | 168 | 173 | 173 
Roumania.............| 182 | 195 | 220 | 216% 203 | 208% ... | ... |... |... 
Austria...............-| 223 | 223 | 223 | 226 | 213% 293% ... | ...]... 
-s++| 226 | 226 | 250 | 219 | 212 | 226*| 205 | 208 | 199 | 204 
Averages for Austral-| 
asia 117 | 111 | 105 | 111} 95 | 83! 82] 76] 80 
Ceylon.............+--| 158 | 158 | 169 | 168 | 171 | 165 | 198 | 186 | 183 | 189 
Jamaica gu ...| 158 | 170 | 171 | 175 | 174 | 169 | 197 | 223 | 175 | 198 
314 | 264 | 386 | 333 332* 314*| 328 | 297 | 320 | 315 
Averages for Countries 
Named__.......| 184 | 177 | 206 | 207 | 208 | 196 | 241 | 235 | 226 | 234 

TABLE II.


A COMPARISON OF DEATHS BY AGE GROUPS WITH TOTAL DEATHS AT ALL
AGES IN THE REGISTRATION AREA OF THE UNITED STATES, AS SHOWN
BY THE ANNUAL MORTALITY STATISTICS OF THE BUREAU OF THE
CENSUS, 1900-1909.


          | Deaths in the Registration Area.                                        | Ratios to Total Deaths at All Ages.
Years.    | At All Ages | Under 1 Year | 1 to 5 Years | Under 5 Years | Over 1 Year |     Under 1 |      1 to 5 |     Under 5 |      Over 1
1900      |   [539,939] |      111,687 |       52,450 |       164,137 |     428,252 |        20.7 |         9.7 |        30.4 |        79.3
1901      |   [518,207] |       97,477 |       44,201 |       141,678 |     420,730 |        18.8 |         8.5 |        27.3 |        81.2
1902      |   [508,640] |       98,575 |       44,940 |       143,515 |     410,065 |        19.4 |         8.8 |        28.2 |        80.6
1903      |     524,415 |       96,857 |       43,083 |       139,940 |     427,558 |        18.5 |         8.2 |        26.7 |        81.5
1904      |     551,354 |      102,880 |       43,022 |       145,902 |     448,474 |      [18.7] |         7.8 |        26.5 |        81.3
1905      |     545,533 |      105,553 |       41,831 |       147,384 |     439,980 |        19.3 |         7.7 |        27.0 |        80.7
1906      |     658,105 |      133,105 |       53,873 |       186,978 |     525,000 |        20.2 |         8.2 |        28.4 |        79.8
1907      |     687,034 |      131,110 |       52,664 |       183,774 |     555,924 |        19.1 |         7.7 |        26.8 |        80.9
1908      |     691,574 |      136,432 |       53,433 |       189,865 |     555,142 |        19.7 |         7.7 |        27.5 |        80.3
1909      |     732,538 |      140,057 |       56,477 |       196,534 |     592,481 |        19.1 |         7.7 |        26.8 |        80.9
Total     |   5,957,339 |    1,153,733 |      485,974 |     1,639,707 |   4,803,606 |        19.4 |         8.2 |        27.5 |        80.6
1900-1904 |   2,642,555 |      507,476 |      227,696 |       735,172 |   2,135,079 |        19.2 |         8.6 |        27.8 |        80.8
1905-1909 |   3,314,784 |      646,257 |      258,278 |       904,535 |   2,668,527 |        19.5 | [illegible] | [illegible] |      [80.5]

TABLE III.


A COMPARISON OF LIVING BIRTHS, DEATHS UNDER 1 YEAR, AND TOTAL
DEATHS AT ALL AGES, IN CONNECTICUT, MASSACHUSETTS AND NEW
YORK, AS SHOWN BY ANNUAL REGISTRATION REPORTS, 1900-1909.


[The figures of this table are illegible in the source text.]


TABLE IV.


A COMPARISON OF BIRTHS AND DEATHS UNDER 1 YEAR IN THE BOROUGHS
OF NEW YORK CITY IN THE SEASON OF THE HEAVIEST INFANT
MORTALITY, 1909 AND 1910, COMPILED FROM THE WEEKLY REPORTS OF
THE DEPARTMENT OF HEALTH OF THE CITY.


[The weekly figures of this table are illegible in the source text.]


TABLE V.


A COMPARATIVE RESUME OF BIRTHS AND DEATHS UNDER AGE 1, IN THE
BOROUGHS OF NEW YORK CITY IN THE SEASON OF THE HEAVIEST
INFANT MORTALITY, 1910 AND 1909, COMPILED FROM THE WEEKLY
REPORTS OF THE DEPARTMENT OF HEALTH OF THE CITY.


             | Third Quarter of 1910.                                 | Third Quarter of 1909.
Boroughs.    | Living Births | Deaths under 1 Year | Per 1,000 Births | Living Births | Deaths under 1 Year | Per 1,000 Births
Manhattan    |        16,505 |               2,940 |              178 |        15,681 |               2,632 |              168
The Bronx    |         2,821 |                 358 |              127 |         2,290 |                 321 |              140
[Brooklyn]   |        11,065 |               1,736 |              157 |        10,324 |               1,725 |              167
Queens       |         1,781 |                 348 |              195 |         1,571 |                 321 |              204
[Richmond]   |           562 |                 135 |              240 |           497 |                 138 |              278
Totals       |        32,734 |               5,517 |              169 |        30,363 |             [5,137] |              169


TABLE VI.


A SIMILAR RESUME, BY MONTHS, FOR THE STATE OF CONNECTICUT,
COMPILED FROM THE MONTHLY BULLETINS OF THE CONNECTICUT
STATE BOARD OF HEALTH.


             | Third Quarter of 1910.                                 | Third Quarter of 1909.
Months.      | Living Births | Deaths under 1 Year | Per 1,000 Births | Living Births | Deaths under 1 Year | Per 1,000 Births
[July]       |         2,363 |                 587 |              248 |         2,115 |                 407 |              192
[August]     |   [illegible] |                 404 |              172 |         2,290 |                 483 |              211
[September]  |   [illegible] |                 343 |              155 |         2,180 |                 373 |              171
Totals       |         6,926 |               1,334 |              193 |         6,585 |               1,263 |              192

TABLE VII.


A REVIEW OF THE INCREASING WASTE OF INFANT LIFE DUE TO THE FOUR
PRINCIPAL CAUSES OF INFANT MORTALITY, IN THE REGISTRATION
AREA OF THE UNITED STATES, AS RECORDED IN THE ANNUAL MOR-
TALITY STATISTICS OF THE BUREAU OF THE CENSUS, 1900-1909.


                  | Deaths under 1 Year | Deaths under 1 Year from Four Principal Causes.                                | Total Deaths from These Four Causes.
                  | from All Causes.    | Diarrhea and Enteritis | Diseases of Early Infancy.            | Malformations | Number  | Percentage of All
Years.            |                     |                        | Premature Birth | Congenital Debility |               |         | Deaths under 1 Year.
1900              |             111,687 |                 27,627 |          10,170 |              13,484 |       [3,227] |  54,508 | 48.80
1901              |              97,477 |                 23,357 |           8,615 |              12,107 |         3,136 |  47,215 | 48.44
1902              |              98,575 |                 21,912 |           9,087 |            [12,724] |         3,165 |  46,888 | 47.57
1903              |              96,857 |                 22,202 |          10,143 |              12,371 |         3,677 |  48,393 | 49.96
1904              |             102,880 |                 25,286 |          11,361 |              12,640 |         4,046 |  53,333 | 51.84
1905              |             105,553 |                 27,455 |          11,102 |              12,515 |         4,299 |  55,371 | 52.46
1906              |             133,105 |                 35,220 |          14,250 |              15,493 |         5,857 |  70,820 | 53.21
1907              |             131,110 |                 34,408 |          15,245 |              15,392 |         6,057 |  71,102 | 54.23
1908              |             136,432 |                 37,049 |          16,441 |              15,833 |         6,525 |  75,848 | 55.59
1909              |             140,057 |                 36,516 |          18,286 |              14,988 |         7,286 |  77,076 | 55.03
Totals, 1900-1909 |           1,153,733 |                291,032 |         124,700 |             137,547 |        47,275 | 600,554 | 52.05
Totals, 1900-1904 |             507,476 |                120,384 |          49,376 |              63,326 |        17,251 | 250,337 | 49.33
Totals, 1905-1909 |             646,257 |                170,648 |          75,324 |              74,221 |        30,024 | 350,217 | 54.19



TABLE VIII.


A RECAPITULATION OF DEATHS UNDER AGE 1 DUE TO EACH OF THE FOUR
PRINCIPAL CAUSES OF INFANT MORTALITY IN THE REGISTRATION
AREA OF THE UNITED STATES AS SHOWN BY THE DETAILED STATISTICS
IN TABLE VII.


                                  Deaths under 1 Year in the Last Decade.

                                  Ten-year Period,        First Half of           Second Half of
                                  1900-1909.              Period, 1900-04.        Period, 1905-09.

Causes of Death.                  Deaths.      Per cent   Deaths.      Per cent   Deaths.      Per cent
                                               of Total.               of Total.               of Total.

All causes                        1,153,733               507,476                 646,257

Diarrhea and enteritis            291,032      25.23      120,384      23.72      170,648      26.41
Diseases of early infancy:
    Premature birth               124,700      10.81      49,376       9.73       75,324       11.66
    Congenital debility           137,547      11.92      63,326       12.48      74,221       11.48
Malformations                     47,275       4.10       17,251       3.40       30,024       4.65

Total, four above-named causes    600,554      52.05      250,337      49.33      350,217      54.19
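The percentages in Table VIII follow directly from the death counts, and the totals for the four causes are simple column sums. The sketch below, in Python, recomputes both from the printed counts; the names `per_cent_of_total` and `four_cause_totals` are illustrative only, and rounding to two decimals is an assumption about how the printed percentages were formed.

```python
# Deaths under 1 year, transcribed from Table VIII.
PERIODS = ("1900-1909", "1900-04", "1905-09")
ALL_CAUSES = (1_153_733, 507_476, 646_257)
FOUR_CAUSES = {
    "Diarrhea and enteritis": (291_032, 120_384, 170_648),
    "Premature birth": (124_700, 49_376, 75_324),
    "Congenital debility": (137_547, 63_326, 74_221),
    "Malformations": (47_275, 17_251, 30_024),
}


def per_cent_of_total(deaths, total):
    """Share of all deaths under 1 year, in per cent., to two decimals."""
    return round(100 * deaths / total, 2)


def four_cause_totals():
    """Sum of the four causes for each period, in the order of PERIODS."""
    return tuple(sum(column) for column in zip(*FOUR_CAUSES.values()))


for period, total, four in zip(PERIODS, ALL_CAUSES, four_cause_totals()):
    print(period, four, per_cent_of_total(four, total))
```

Applying `per_cent_of_total` to any single cause and period gives the corresponding entry in the percentage columns.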




THE CENSUS AGE QUESTION. 


By Allyn A. Young, Professor of Economics, Leland Stanford Junior
University. 


I. 


In these Publications for June, 1910 (pp. 110-123), Professor 
William B. Bailey and Mr. Julius H. Parmelee printed the 
results of a study of the age returns of the census of 1900. 
They have brought to light a number of new and interesting 
facts which must be taken into account by every serious 
student of our population census. Even more important are 
the conclusions to which the interpretation of these new 
results has led them, for they have decided that the fact that 
in 1900 (for the first time in the history of the federal census) 
the enumerators were required to ascertain the year and 
month of birth of every person enumerated in addition to the 
age, did not materially affect the quality of the age statistics 
of that census. 

This conclusion is of interest because it is sharply at vari- 
ance with former views on the subject. But it gains further 
importance from the fact that it led to the omission of the 
date of birth question in the census of 1910. These consid- 
erations may be held to justify an analysis of the report, with 
a view to ascertaining the roots of its inferences and testing 
the validity of its conclusions. Before entering upon this 
task it will be well, I think, to outline some of the various 
considerations that might lead one to imagine that the date 
of birth inquiry should have had some beneficial effect upon 
the age statistics in question. These considerations are of 
unequal weight, and I have arranged them roughly in what 
seems to me the inverse order of their importance. 

1. It may be taken, I think, as a fair and reasonable general 
presumption, that a question about a thing so precise and 
definite as the year and month of one’s birth will elicit in 
general more accurate answers than a question as to age, 
even when enumerators are especially cautioned to guard
against loose and inexact answers and against the tendency
to state age in round numbers. It frequently happens, I have 
found, that after persons have passed a certain age and the 
years have begun to slip by rapidly, their age in years is not 
kept definitely in mind, and an answer to an inquiry as to 
age is apt to involve a mental subtraction of the year of
birth from the present year, as being on the whole easier than
an attempt to bring to account the fugitive memory of 
age in years. I am not disposed to lay much stress on this 
consideration, for there is no way of determining how preva- 
lent the habit mentioned is. But even if in the great majority 
of cases one’s age is more definitely kept in mind than one’s 
date of birth, there still remains the presumption (reasonable 
enough, I suggest, to be at least taken into account in any 
thorough consideration of the matter) that the very pre- 
cision of the date of birth question should emphasize, both 
to enumerator and enumerated, the necessity of an accurate 
and precise answer to the age question. 

2. The date of birth inquiry was recommended by the 
International Statistical Congress in 1872, and no subse- 
quent international statistical gathering has reversed or 
modified the recommendation. 

3. In the majority of European censuses, and especially 
in those countries* in which statistical practice has reached 
the highest level, date of birth, rather than age, is asked.†


* Austria, Belgium, Germany, Holland, Hungary, Italy, Norway, Sweden and Switzerland
may be mentioned.

† I do not mean to suggest that European census methods are inherently better than our
own. But European practice in these matters affords a fairly good indication of where the
weight of the opinion of qualified statistical experts may be supposed to rest. I think it
worth while in this connection to quote one writer whose deservedly high reputation rests
upon achievements in both statistical administration and statistical analysis:

“Asking the age directly is the less common method, and is becoming obsolete, although
it was used in the English and French censuses of 1891. Usually it is limited to the
determination of the number of completed years of life. Since censuses are not often taken
at the end of a calendar year, the answers to this question do not permit us to ascertain the
distribution of the population in objective groups of calendar years of birth. Moreover,
this form of question increases the degree of uncertainty in the answers, for the age of an
individual is a changing fact, every now and then to be reckoned anew. Mistakes happen
in such reckonings, and it is even more often the case that the computations are not
seriously undertaken, but round numbered estimates are set down instead. Further difficulties
result from the fact that this question tends to confuse the year of age, properly so called
(that is, the sum of the years of life entirely completed and passed by), with the year of life,
properly so called (that is, the year of life in which an individual yet remains, not having
yet completed it). The question as to the immutable fact of the date of birth (year,
month and day) is therefore much more to the point, and is to be preferred wherever
the level of popular education is high enough to permit of its being generally answered.
To ask merely the calendar year of birth does not serve the purpose, and is also attended
by the difficulty that it does not make it possible to determine the actual age of the persons
enumerated unless the census happens to be taken exactly at the end of the year.”
Von Mayr, Bevölkerungsstatistik, p. 74.




4. A comparison of European censuses in which the age at 
last birthday is asked with those in which the date of birth 
is asked shows that the age statistics of the latter are in 
general more accurate than those of the former.* 

5. The age returns of the United States Census of 1900 
were distinctly more accurate than those of any previous 
federal census. It is important to note that this increased 
accuracy showed itself in four ways: (1) There was a decrease 
in the overstatement of the ages of young children.† (2)
There was a decided lessening of the concentration of reported
ages on years constituting multiples of five and ten.‡ (3)
There was as compared with prior censuses a general smoothing
of the whole age series.§ This improvement, as measured
by the simple index which I have called the “coefficient
of error,” involved more than the mere reduction of the concentration
on round numbers, for while in this last respect
the improvement in the census of 1890 over that of 1880
was as marked as the improvement in 1900 over 1890, the
“coefficient of error” was only 3.4 per cent. in 1900 as against
7.5 in 1890 and 8.2 in 1880. (4) The number of reported 
centenarians was reduced from 8.0 per 100,000 population 
in 1880 and 6.4 per 100,000 in 1890 to 4.6 per 100,000 in 1900. 


II. 


The investigation of Professor Bailey and Mr. Parmelee 
is based on a careful examination of original schedules of 
the census of 1900 from five large cities and five rural coun- 
ties, embracing in all 130,000 enumerated persons. It was 
found that in 12,526 cases,—a little less than ten per cent. 
of the entire number,—the reported date of birth disagreed 
with the reported age. In some cases other facts reported 
regarding the person enumerated were prima facie evidence 
as to which of the two discordant statements was correct, or 


* I have given such a comparison, so far as the statistics of children’s ages are concerned,
in these Publications, Vol. VII, p. 237, and in Twelfth Census, Supplementary Analysis, p.
140. I have also applied simple tests to the ages of adults as reported in the same censuses.
The comparison indicated the superior accuracy of the date of birth inquiry, but
the results seemed scarcely worth publishing, as I thought (I confess) that the matter was
hardly open to question. The statement made in the text is, however, one which may
easily be verified.

† Twelfth Census, Supplementary Analysis, pp. 139-143.

‡ Ibid., p. 136.

§ Ibid., pp. 134-137.




more nearly correct. In forty-four out of a random 
selection of fifty of such cases it was found that the reported 
age was more nearly correct than the reported date of birth. 
Of the eight cases given in full as being typical, seven involve, 
beyond reasonable doubt, errors of one kind and another in 
the statement of the date of birth. These facts seem to 
indicate clearly that where there are discrepancies between 
the reported dates of birth and the reported ages and where 
these reports can be controlled by other facts reported in
the schedules, the reported ages are in general more trust- 
worthy than the reported dates of birth. A corollary of this 
finding is that where there were such discrepancies, the date 
of birth was more often estimated from the age than the age 
from the date of birth. 

It is not a necessary additional inference, however, that 
where the reported age and the reported date of birth agreed 
(that is, in over nine tenths of the 130,000 cases examined) 
the date of birth was more often the dependent and age the 
independent statement.* In the first place it is obvious that 
where neither age nor date of birth was known, the age must
have been roughly estimated first, and then the date of birth 
computed. Doubtless this happened frequently in the 
enumeration of the more ignorant classes of the population, 
and in other classes whenever the information about the 
persons enumerated was not given by themselves or by 
members of their immediate families. In the second place, what- 
ever the attending circumstances were, every case of dis- 
agreement between the reported age and the reported date 
of birth is a mark of carelessness on the part of the enume- 
rator. Now there is unmistakable evidence in the age 
returns themselves that where enumerators were working 
among an illiterate class of the population they lowered their 
own standards of accuracy.† The same effect would have
been produced, I imagine, where for any other reason, such 
as linguistic barriers, or inability to get returns directly from 


* There is the further possibility that in many cases both date of birth and age were
accurately known and independently stated. The essential point, after all, is the effect of
the date of birth question upon the precision of the returns. This is more important than
the question of the dependence or independence of one or the other statement.


† Twelfth Census, Supplementary Analysis, pp. 136, 137.




more than a small fraction of the persons enumerated, the 
enumerators found that their age returns were mere guesses. 
It would have seemed futile to take great pains in setting 
down the dates of birth, when the ages (in such cases, at
least, the first estimates) were known to have little precision.
The upshot of these considerations is to indicate a more or 
less cogent possibility that the 12,526 returns, in which dis- 
agreements between the reported dates of birth and reported 
ages were found, were not fairly representative of the 130,000 
cases examined (and, a fortiori, of the age returns of the 
Twelfth Census in general), but that they included a somewhat 
larger proportion of inaccurately known ages, and, hence, 
of cases in which the date of birth was estimated more or less
carelessly from the reported age. There is still another pos- 
sibility that is at least worth mentioning: namely, that, as 
it was understood by the enumerators that the date of birth 
inquiry was only a method of ascertaining age, computations 
from the reported age back to the date of birth were on 
that account less carefully executed than computations from 
the reported date of birth to the age. This might help explain 
the prevalence of the first kind of computations among the 
cases of disagreement. 

More than two thirds of the 12,526 cases of disagreement 
were, however, of a specialized type,— that is, in over two 
thirds of these instances the reported age was just one year 
greater than would have been consistent with the reported 
date of birth. In these cases there was not enough discrep- 
ancy between the two reports to permit their being often 
tested by their harmony or lack of harmony with other 
facts on the schedules. I think it fair to infer (and the infer-
ence is supported by the eight cases previously mentioned as 
given in detail) that the fifty tested cases of disagreement 
were drawn in large part from the miscellaneous one third 
of the cases of disagreement and in small part from the 
specialized two thirds. This at once raises a doubt whether 
the inference drawn from the fifty controlled cases (that in 
cases of disagreement the age is more often correctly stated 
than the date of birth) can be extended to these predominant
and specialized cases of disagreement. Professor Bailey*
seems to have experienced this doubt, for although he is 
convinced that in these cases the date of birth was computed 
(erroneously) from the age, he rests his case on what seem 
to him the inherent probabilities of the matter,† supported
by the fact that in several cases there is apparently clear 
evidence that the enumerator had computed and entered the 
years of birth at the close of his day’s field work. 

But there is a further peculiarity in this class of incompatible
returns, for “it was found that in all these cases almost
without exception the birth month was one of the last seven 
months of the year.” As the information on the census 
schedules is supposed to represent the ages as they were on 
June 1 of the census year, it is obvious that there is some 
relation between this concentration of discrepant dates of 
birth and the date of the census. To borrow a concrete illustration
from the paper under review: “X is returned on the
schedule as born in September, 1865, and as being 35 years 
of age. . . . Therefore, if the date of birth was correct, 
X was not 35 years of age on June 1, 1900, but only 34; if 
on the other hand he was, in fact, 35 years of age in June, 
1900, and was born in September, then the year of his birth 
was not 1865, but 1864.” Now if we proceed on the assump- 
tion that in this and similar cases of disagreement, either 
the reported age or the reported date of birth, the one or the 
other, accurately represented the truth with respect to the 
person enumerated, I see no way in which we can, without 
begging the question, assume that the age was computed 
from the date of birth, or, per contra, that the date of birth 


* This part of the article under review is written in the first person singular. 


† “To the writer, however, this [that the year of age was obtained in these cases by
subtraction from the date of birth] seems most unlikely, because it is his belief and observation
that most people keep a better mental record of their age than of their year of birth, and
will answer more rapidly and accurately an inquiry as to age in years than as to year of
birth, whether the inquiry applies to themselves or to some relative, friend, or acquaintance.”
Loc. cit., p. 114.


I am inclined to agree with Professor Bailey’s opinion so far as it relates to the ages of
“relatives, friends, or acquaintances.” So far as one’s own age is concerned I have expressed
above a variant opinion, which, however, may be taken as relating only to the
mental habits of intelligent persons of mature age. But, as we shall see later, the point is
of minor importance. It may be of interest, however, to note that in European
censuses in which the date of birth is asked and which are taken in such years that round
numbered years of birth do not coincide with round numbered ages, the concentration is
on round numbered years of birth rather than on round numbered ages. This seems to
prove conclusively that in those countries, at least, the date of birth is not very frequently
computed from the age. (See especially the analysis of age statistics in the Swiss census of
1888.)




was computed from the age. If, however, some weight can 
be conceded to the tentative suggestion that these cases of 
disagreement constitute an especially inaccurate set of age 
returns, it becomes plausible that the ages were in general
independently though loosely stated, and the date of birth 
computed in a careless manner from the reported age. 
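The arithmetic behind the one-year discrepancy may be sketched in Python. This is an illustration only: the names `age_on_census_day` and `discrepancy` are not from the source, and a birthday falling anywhere in June is treated as not yet reached on June 1, in keeping with the remark that the discrepant birth months were the last seven of the year.

```python
CENSUS_YEAR = 1900
CENSUS_MONTH = 6  # the schedules refer to June 1 of the census year


def age_on_census_day(birth_year, birth_month):
    """Completed years of age on June 1, 1900.

    Assumption: a birthday falling in June or later has not yet been reached.
    """
    age = CENSUS_YEAR - birth_year
    if birth_month >= CENSUS_MONTH:
        age -= 1
    return age


def discrepancy(reported_age, birth_year, birth_month):
    """Reported age minus the age implied by the reported date of birth."""
    return reported_age - age_on_census_day(birth_year, birth_month)


# X: returned as born in September, 1865, and as 35 years of age.
print(age_on_census_day(1865, 9))  # age implied by the reported date of birth
print(discrepancy(35, 1865, 9))    # the one-year discrepancy
print(age_on_census_day(1864, 9))  # age if the year of birth was in fact 1864
```

A bare subtraction of the year of birth from 1900, with the month ignored, overstates the age by one year for exactly the months June through December, which is the pattern the writers report.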

The census was taken in a round numbered year and there 
was a partial coincidence between round numbered years of 
birth and round numbered ages. It would seem at least possi- 
ble that in cases where a round numbered age was set down as 
a loose approximation, a round numbered date of birth should 
have been set down in similarly careless fashion. The writers 
fortunately furnish a table showing the distribution of the 
final digits of the reported ages in 8,851 cases of these one- 
year discrepancies. Twenty-five per cent. of these reported 
ages end with either 0 or 5. The corresponding per cent. 
for the entire population in 1900 was 21.2.* The difference 
is not great, and I am not sure that it is at all significant, for 
the one-year discrepancies may have been especially frequent
in the reported ages of adults, where the concentration
on multiples of 5 is most noticeable. So my hypothesis 
seems to fall to the ground so far as these one-year discrep- 
ancies are concerned, and I see no objective method of deter- 
mining for these cases whether the reported date of birth or 
the reported age should be considered more frequently correct. 
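The measure used in this comparison, the per cent. of reported ages ending in 0 or 5, is simple to compute. The following Python sketch shows it under stated assumptions: the function name is illustrative, and the sample ages are hypothetical and are not census data.

```python
def per_cent_ending_in_0_or_5(ages):
    """Per cent. of reported ages whose final digit is 0 or 5."""
    ages = list(ages)
    heaped = sum(1 for age in ages if age % 5 == 0)
    return 100 * heaped / len(ages)


# Hypothetical reported ages, for illustration only.
sample_ages = [30, 31, 35, 38, 40, 42, 45, 47, 50, 53]
print(per_cent_ending_in_0_or_5(sample_ages))

# Figures quoted in the text: the discrepant cases against the whole population of 1900.
print(round(25.0 - 21.2, 1))  # difference in percentage points
```

Were ages spread evenly over the ten final digits, the measure would stand near 20 per cent., so both figures quoted in the text show some heaping.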


III. 


Professor Bailey and Mr. Parmelee have, however, compiled 
a table which, if interpreted in a certain way, seems to add 
considerable weight to their contention that ages, rather than 
dates of birth, were independently stated. This table shows 
the proportions in which ages reported as between twenty- 
three and sixty-two years inclusive were distributed among 
the years ending in each of the ten digits. Such figures are 
given for the aggregate population as reported in 1900, 1890, 
and 1880, and for the population in 1900 as redistributed by 


* This is the per cent. which reported ages ending in 0 or 5 made of the aggregate number
of persons reported as one year old and over. The number of children reported as under one
year is excluded from the computation as this number could not have included any one-year
age discrepancies.




interpolation within the successive five-year age groups. 
This device shows very clearly the reduced concentration 
on multiples of five in 1900, and it also shows the correspond- 
ing relative increase in the numbers reported at certain other 
ages. The significant thing is an especially noticeable swell- 
ing of the numbers reported at ages ending in 9 or 4. From 
what we know about the nature of the errors in age returns 
we would have expected this decrease in the concentration 
on round numbers to fill up the ages immediately above the 
round numbers rather more than the ages immediately below 
them.* Yet the table seems to show that the apparently less 
probable result happened. 

The explanation offered by Professor Bailey and Mr. 
Parmelee is essentially as follows: When the schedules were 
edited in the census office, such cases of discrepancies between 
the two age reports as were noticed were adjusted on the
assumption that the date of birth was the more correct return.
As a result, when the reported age was a year greater than 
the reported date of birth warranted, it was reduced by one 
year. And since there was a concentration of the reported 
ages on round numbers, by this process of adjustment the 
years ending in 4 and 9 lost less to years ending in 3 and 8 
than they gained from years ending in 5 and 0. Similarly, 
years ending in 5 and 0 lost more to years ending in 4 and 9 
than they gained from years ending in 6 and 1. If this explanation
is adequate, the abnormal swelling of the ages ending in
4 and 9 indicates, it is thought, that the corrections made in 
the census office were not well advised, the reported ages 
being more accurate than the reported date of birth. It also 
indicates,— what is even more to the point,— that part of 
the diminished concentration on round numbers in 1900 is 
apparent rather than real, having been brought about by the 
arbitrary clerical process described.†
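The mechanism of this explanation can be made concrete with a small Python sketch. Everything in it is hypothetical: the counts per final digit are invented to show heaping on 0 and 5, the share of one tenth moved down a year is arbitrary, and the final digit 0 is allowed to pass to 9 without regard to the limits of the age range.

```python
def shift_down_one_year(counts_by_final_digit, share_num=1, share_den=10):
    """Move a fixed share of the persons at each final digit to the digit next below."""
    adjusted = dict(counts_by_final_digit)
    for digit, count in counts_by_final_digit.items():
        moved = count * share_num // share_den
        adjusted[digit] -= moved
        adjusted[(digit - 1) % 10] += moved
    return adjusted


# Hypothetical counts per final digit of age, heaped on 0 and 5.
reported = {digit: 90 for digit in range(10)}
reported[0] = 150
reported[5] = 130

adjusted = shift_down_one_year(reported)
net_change = {digit: adjusted[digit] - reported[digit] for digit in range(10)}
print(net_change)
```

In this sketch only the digits 9 and 4 gain and only 0 and 5 lose, because every other digit receives from above as much as it gives up below.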

In fact, Professor Bailey and Mr. Parmelee estimate (on the 
basis of the distribution of ages in 8,851 cases of one-year 

* See these Publications, Vol. VII, p. 38.

† It should be noted, however, that even if the clerical adjustments described were well
advised they would nevertheless have brought about an appreciable swelling of the numbers
reported at ages ending with the digits 4 or 9. But in this case it would scarcely be accurate
to call the accompanying reduction of the concentration on round numbered ages an
“apparent” rather than a “real” improvement.




adjustments of this kind) that this process was responsible 
for a decrease of 2.9 per cent. in the concentration on round 
numbers in 1900. By this much, then, the addition of the 
date of birth inquiry would seem to have been credited with 
more than its real effect in the improvement of our age 
statistics. 

The difficulty with the foregoing explanation is that it 
by no means explains the total amount of increase in the rela- 
tive number of reported ages ending in 4 and 9. The number 
of persons reported by the census at ages ending with the 
digit 9 was (proportionately to the total number between 
twenty-three and sixty-two years) greater by 12 per cent. in 
1900 than it was in 1890. I estimate that the artificial shift- 
ing described above (if it was as frequent as Professor Bailey 
and Mr. Parmelee estimate) was responsible for between 2 
and 3 per cent. in this increase, that is, at most, not more than 
one fourth or one fifth of it. The general reduction in the 
concentration on round numbers was responsible for prob- 
ably not more than 1 per cent. out of the 12.* Altogether 
about two thirds or three fourths of this increase remains to 
be accounted for. In other words, this statistical evidence 
proves too much. 
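The accounting in this paragraph can be checked with a few lines of Python. The sketch takes its figures from the text and from the footnote to this passage; the names are illustrative, and the simple proportional scaling in `footnote_estimate` is the footnote's own working assumption rather than an established relation.

```python
from fractions import Fraction

TOTAL_INCREASE = 12      # per cent. rise at ages ending in 9, 1890 to 1900
CLERICAL_SHIFT = (2, 3)  # per cent. attributed to the clerical adjustment
LESS_HEAPING = 1         # per cent. attributed to reduced heaping ("not more than 1")


def footnote_estimate():
    """Scale the 1880-1890 experience (8.7 decrease, 0.5 increase) to a 14.2 decrease."""
    return 0.5 * 14.2 / 8.7


def unexplained_share(clerical):
    """Fraction of the 12 per cent. left after both explained parts are removed."""
    return Fraction(TOTAL_INCREASE - clerical - LESS_HEAPING, TOTAL_INCREASE)


print(round(footnote_estimate(), 1))
print([unexplained_share(c) for c in CLERICAL_SHIFT])
```

With the clerical shift taken at 2 and at 3 per cent., the unexplained remainder comes to three fourths and two thirds of the whole increase respectively.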

There is another possible explanation of the swelling of the 
reported ages ending in the digits 9 and 4 which appeals to me 
as less labored and more adequate. The date of birth inquiry 
preceded the age inquiry on the schedule used by the enumer- 
ators in 1900. It is reasonable to suppose that in many 
if not the majority of instances the enumerators asked the 
date of birth before they asked the age. Among the answers 
to the date of birth inquiry there must have been, as in 
European census experience, more or less concentration on 
round numbered years. The more careful enumerators, at 
least, would have taken pains to see that the reported age 
agreed with the reported date of birth. Where the month of 
birth assigned happened to be one of the first five months of 


* A decrease of 8.7 per cent. in the relative number of reported ages ending in 0 in 1890
as compared with 1880 was accompanied by an increase in the relative number reported as
ending with 9 of only 0.5 per cent. On this basis, the decrease of 14.2 per cent. in the
number of reported ages ending in 0 in 1900 as compared with 1890 would have been
accompanied by an increase of 0.8 per cent. in the relative number of reported ages ending
with 9.




the year, the reported age would also have been a round 
number,— that is, a multiple of five. But where the month 
of birth assigned was one of the last months of the year, the 
reported age would have been a year ending with either the 
digit 4 or the digit 9. An increase in the relative number of 
ages reported as ending in these digits is a necessary effect of 
the concentration of reported years of birth on round num- 
bers. I offer this as a probable explanation of the amount of 
this increase not otherwise accounted for, and as a possible 
explanation of substantially all of this increase not accounted 
for by the general reduction in the concentration on round 
numbered ages. 
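The effect described here can be tabulated in Python. The sketch rests on assumptions that are not in the source: births are spread evenly over the twelve months, the span of round numbered years of birth (1840 through 1875) is chosen only for illustration, and a June birthday is treated as not yet reached on June 1, 1900.

```python
from collections import Counter


def age_on_june_1_1900(birth_year, birth_month):
    """Completed years of age on June 1, 1900 (June birthdays treated as not yet reached)."""
    age = 1900 - birth_year
    return age - 1 if birth_month >= 6 else age


def final_digits_of_age(birth_years):
    """Tally the final digit of age over every month of birth in the given years."""
    tally = Counter()
    for year in birth_years:
        for month in range(1, 13):
            tally[age_on_june_1_1900(year, month) % 10] += 1
    return tally


# Round numbered years of birth, 1840 through 1875; the range is illustrative.
round_years = range(1840, 1880, 5)
print(sorted(final_digits_of_age(round_years).items()))
```

Under these assumptions every age derived from a round numbered year of birth ends in 0, 5, 9 or 4, and the seven later months send more of them to 9 and 4 than the five earlier months leave at 0 and 5.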


IV. 


The thesis of Professor Bailey and Mr. Parmelee thus loses 
what had been its statistical support. But let us proceed on 
the assumption that their estimate that 2.9 per cent. in the 
decrease of the concentration on round numbers in 1900 was 
only apparent is thoroughly in accord with the facts. To 
quote the article under review: “If the [clerical] reductions 
had not been made, therefore, the excess concentration in 
1900 would have been 22.7 per cent. and not 19.8 per cent.” 
From this to the conclusion that the date of birth inquiry was 
not justified by its results is surely a non sequitur,— for the 
corresponding excess concentration in 1890 was 31.3 per cent., 
and in 1880 it was 44.8 per cent. In 1890, it should be re- 
membered, the instructions to enumerators were much more 
insistent in the warnings given against the tendency to report 
ages in round numbers than they were in either 1880 or 1900. 
I know of no cause, except the date of birth inquiry, to which 
the marked and very creditable improvement in the charac- 
ter of the age statistics in 1900 can be attributed. I have 
no confidence in the efficiency of any ‘‘general improvement 
in census methods” apart from the concrete specific steps 
which constitute this general improvement. 

* If in these one-year discrepancies the reported dates of birth were in general more
trustworthy than the reported ages, the effect of the clerical adjustments would simply
constitute a special case, coming under the general explanation I have offered.


It should be noted, finally, that if Professor Bailey and Mr.
Parmelee have published the full results of their investigation
they have neglected to estimate the amount of weight
that should be given to the statistical practice of other countries, 
to the opinions of qualified experts in those countries, and to 
the recommendations of an international statistical gather- 
ing of the highest authority. They have neglected to com- 
pare the results obtained in censuses in which one form of 
question is used with the results of censuses using the other 
form. They have not reckoned with the effect of the form 
of the age inquiry upon the overstatement of the ages of 
children, upon the “general smoothness” of the age series, or 
upon the overstatement of the ages of persons advanced in 
years. They have shown, however, that the effect of the date 
of birth inquiry in the reduction of the concentration on 
round numbers was possibly not quite so great as had been 
supposed. But I see nothing to justify their final conclusion 
that “the inquiry as to date of birth played little or no part,
either in increasing the accuracy of the age returns of the
Twelfth Census, or in reducing the concentration on years
ending in the integers 5 and 0.” And I fear that a step backward
in statistical practice was taken when “especially in
view of the [foregoing] conclusion, . . . it was decided to
eliminate the query regarding date of birth from the
population schedule of the Thirteenth Census and to retain
only the question regarding age at last birthday.”




THE NEW YORK BUDGET EXHIBIT. 


By Leonard P. Ayres.


“The City invites you to see how your money is spent,” 
was the invitation extended by New York to all of its citizens 
from October 3 to October 28. The words quoted appeared 
in great letters on a sign stretching across the entire front of 
the Tefft-Weller Building at 330 Broadway, where the 
Budget Exhibit was held with the city as host. 

The exhibit was splendidly advertised. Each taxpayer 
received with his tax bill an invitation to attend; a large sign 
to the same effect was placed on Brooklyn Bridge, and wide- 
spread publicity was given through the public press. The 
object of the city’s hospitality was to persuade its citizens to 
visit the great, graphic annual report in which for the first 
time the heads of New York’s departments showed by charts, 
pictures, object lessons and through personal representatives 
how much of the taxpayers’ money they spend, how they 
spend it, and how much they need for next year. 

Inside of the building more than an acre of space on three 
floors was devoted to illustrating the city’s activities. The 
directory on the first floor showed that these municipal activi- 
ties fell within fifty-four divisions, and that they were illus- 
trated in over 350 booths. The booths were made of green 
burlap, mounted on light scantling and serving as a back- 
ground for charts, pictures, photographs and exhibited articles. 

The significant feature of the whole affair is that the exhibit 
was held while the annual budget of the city was under con- 
sideration, while hearings on its different sections were being 
held, and before a dollar of expenditure had been voted. The 
object was to enlist the intelligent codperation of the citizens 
in the approval or rejection of the items of appropriation 
asked for by the department heads. That the endeavor to 
arouse public interest was successful is shown by the attendance,
which reached 50,000 a day the first few days, and
during the entire four weeks averaged about 35,000 per day
the first five weekdays and 70,000 on Saturdays. The entire 
number of visitors was in the vicinity of one million. 

The cost was defrayed by a city appropriation of $25,000, 
of which $3,000 was consumed by the rent of the building. 
The exhibit had its own office on the second floor which han- 
dled administrative detail and gave out press matter. 

The exhibit included a large number of object lessons made 
up of what one visitor called “the real things themselves,” 
and it was these that aroused the greatest interest. On enter- 
ing the front door the visitor was brought face to face with 
a collection of hundreds of false scales, light weights, meas- 
ures with false bottoms, and cans with double sides, which 
had been confiscated by the Bureau of Weights and Measures, 
and which never failed to make a personal appeal to the 
man who realized that he had often been the victim of these 
contrivances. 

In the basement one found “Baby,” the white horse that 
has hauled carts for the Street Cleaning Department for nine- 
teen years, “Teddy,” a horse of the same department that 
always takes the prizes in his class at horse shows, and “Brentwood,”
the famous fire horse that has run to every big fire
for fifteen years. Nearby were shown pieces of fire-fighting 
and street-cleaning apparatus, old and new, together with 
charts, maps and pictures showing what the departments 
had spent and what they had accomplished in the past, what 
they are doing now, and what support they ask for the coming 
year. 

Models and samples of all kinds were shown. Those of 
the Police Department showed the athletic apparatus used 
in the police gymnasiums, the complete equipment carried 
by each patrolman and the special outfit of the mounted 
police. The Fire Department exhibited a great floor map of 
the city, occupying as much space as a large room in a dwell- 
ing house, and showing distributed over it in the proper 
places little models of all of the fire stations and their men, 
horses and apparatus. 

The Tenement House Department showed by actual 
samples the old-law and the new-law fire escapes. The
Dock Department exhibited a diver’s suit and the Water 
Department showed a full size cross-section of the new aque- 
duct. The fights for clean milk and against tuberculosis 
had their places, sample traveling libraries of the Library 
Department were to be seen, and actual samples of coal, 
oats and other supplies bought by the different branches of 
the municipality, together with explanations as to cost and 
purchasing methods were prominent features. One particu- 
larly interesting exhibit was that of the Water Department, 
which showed a leaky faucet, which, it was explained, would
cost a landlord nearly six dollars a year in water rates, and a 
water closet with an invisible leak, and attached to it a water 
meter with an enlarged tell-tale hand showing that a steady 
waste of water was going on at a rate which would cost the 
householder eighteen dollars per year. 

To the student of applied statistics, it was significant that 
these and similar exhibits were the ones which most deeply 
interested the people. A study of the crowds on different 
days seemed to indicate pretty clearly that they were most 
interested in the object lessons made up of what they termed 
“real things.” After these, in the scale of effectiveness, 
came object charts made of such things as piles of paper 
boxes or upright columns, arranged to represent varying 
amounts of expenditure in different years, etc. Next came 
charts, then pictures, and last of all printed placards giving 
statements about work and expenditure. 

A very clear lesson was that in making up such an exhibit 
the cumulative fatiguing effect of looking at statistical data 
must be constantly borne in mind. This was clearly shown 
by watching the people who entered the building. Almost 
without exception, they would examine carefully charts and 
diagrams near the entrance door, and after looking at five or 
six in detail, they would walk along and merely glance at the 
rest. 

The reason is that it is a mental impossibility for anyone, 
no matter how deeply interested, to examine consecutively 
in detail several hundred complex statistical charts. The 
lesson is that those who devise charts for public exhibitions 
should strive to make the smallest possible number of truly
salient points in the most emphatic way. Exhibits are weak 
in proportion as they are diffuse. 

Graphic charts made in the familiar way, with lines of 
different colors indicating variations on a background of 
ordinates and abscissæ, seem to be beyond the comprehension
of the average visitor. This was almost invariably the case 
when a number of lines were shown on the same chart. On 
the other hand, charts made up of contrasted surfaces, series 
of upright columns, surfaces of frequency, and so on, seemed 
to be generally appreciated and understood. 

Charts showing two contrasted sets of data, such, for 
example, as “Receipts” and “Expenditures,” seemed to be
well understood when the contrast was indicated by using 
two colors for the two sets of data, as for example, red for 
“Receipts” and black for “Expenditures.” It was not suffi- 
cient to indicate these contrasts by words alone; chromatic 
distinction was essential. 

Judging from his actions and comments, the man from the 
street does not understand a historical diagram, showing 
variations over a series of years, unless it runs from left to 
right like the words on the ordinary printed page. It is futile 
to try to get him to read up or down or backward. Again, 
increases must be represented by rises in the lines or surfaces 
of the chart. When he reads that expenditures have gone 
up in amount, he wants to see on the chart that they have 
gone up in direction. It is hard work for him to understand 
it when he reads that they have gone up and sees that they 
have gone sideways. 

In lettering the charts, wording of the most positive, simple, 
direct sort seems to be essential. By this is meant wording 
consisting of positive statements in telegram style. People 
walking through a crowded exhibition will not stop to read 
an intricate statement or to examine the details of a complex 
chart. In a similar way, those charts seemed to be best understood
which avoided using decimal places in showing amounts
of money and fractions of percentages. Apparently the or- 
dinary business man does not understand a percentage figure 
carried out to several decimal places. In short, the use of 
“round numbers” seems most desirable.


Charts containing reading matter only appeared to be 
the least effective exhibits shown. In general, the fault of 
these placards was that they contained too much. Time and 
again, men in the crowd were seen to read as far down on these 
charts as the first five lines of reading matter, after which 
they would sigh and pass on to the next. 

As an opportunity to study the psychology of a statistical 
exhibit designed to inform and influence non-technical vis- 
itors, the New York Budget Exhibit was unique and unsur- 
passed. Many of the booths were admirably arranged, and 
a large part of the charts, diagrams and object lessons shown 
were most effective. Moreover, the experience gained by 
the city employees who for the first time marshaled the salient 
facts about their own departments has been of immense value 
to themselves, and will result in even better exhibits in future 
years. 

New York City is spending each year about $160,000,000. 
She has just spent $25,000 in telling her citizens what they 
got for their money. In other words, for every dollar spent 
on municipal activities, one sixty-fifth part of one cent has 
been spent on giving a million people a highly successful 
course in applied municipal economics. On the principle 
that in every business it pays to spend enough to find 
out the essential facts about the business, this expenditure 
has been eminently wise. 
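
The proportion stated above can be rechecked from the two round figures given. The sketch below uses exact fractions; the variable names are illustrative only. With the rounded figures the result is one sixty-fourth of a cent, close to the one sixty-fifth quoted, which suggests the quoted fraction rests on a slightly larger spending total than the rounded $160,000,000.

```python
from fractions import Fraction

ANNUAL_SPENDING = 160_000_000  # dollars per year, the round figure quoted
EXHIBIT_COST = 25_000          # dollars, the appropriation quoted

# Cents spent on the exhibit for every dollar of municipal expenditure.
cents_per_dollar = Fraction(EXHIBIT_COST * 100, ANNUAL_SPENDING)
print(cents_per_dollar)  # 1/64

# Annual spending that would make the fraction exactly one sixty-fifth.
implied_spending = EXHIBIT_COST * 100 * 65
print(f"{implied_spending:,}")  # 162,500,000
```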

As strikingly stated by the Brooklyn Eagle, “The old sys- 
tem was government under a blanket, in which an occasional 
peep-hole had been cut to allay the suspicions of the tax- 
payer; the new system proposes government in the glare of 
a searchlight.” 



REVIEWS AND NOTES. 


MORTALITY STATISTICS OF THE REGISTRATION AREA OF 
THE UNITED STATES, 1910. 


It is a pleasure to be able to record the fact that at least two important 
innovations will be made in the Mortality Statistics of the Bureau of the 
Census, beginning with the report for 1910. First, mortality by days 
of age will be tabulated for the first three weeks of life and by months 
for the first two years of life. Second, detail information, including 
principal causes of death with distinction of age and sex, will be given 
for the more important elements of the foreign-born population — Irish, 
German, Italian, etc. 

These improvements in the annual mortality reports will undoubtedly 
result in a most valuable addition to our knowledge of infant mortality 
and the comparative vitality of the foreign-born. In no country is it 
more important and desirable that information of this kind be as full 
and reliable as possible than in the United States, and, on the other hand, 
in no other great civilized country have the opportunities for the collec- 
tion of the facts been more completely neglected than in this country. 

The new information which will be available with the publication of 
the Census Mortality Statistics of 1910 should prove of great interest 
and importance, for then, it is safe to say, no other country in the world 
will have so large a body of facts on the problems involved, and available 
in so convenient a form. The comparative mortality of the various 
nativity elements, particularly, will be unique, for nothing quite like it 
will be available elsewhere to students of ethnography. As an illustra- 
tion of the great need of such data in this country, it may be stated that 
they would have proven of immense benefit to the Immigration Commission 
in its recent extensive investigation; for, in spite of the great importance 
of the subject of comparative mortality of the foreign-born elements, 
almost no reliable information was ready at hand and whatever data 
were secured had to be compiled from original sources at considerable 
expense.

F. S. Crum.


A CENTURY OF POPULATION GROWTH IN THE UNITED 
STATES. 1790-1900. 


Nations, like individuals, private or corporate undertakings and other 
institutions should occasionally take an inventory of their stock or
resources, so that they may measure their status against the past and be
better able to forecast the future. Progress and decline are relative
and involve the time or historical element, and historical reviews or
surveys are of great value from this viewpoint. They are also of
importance for the many lessons which they teach by bringing to light numerous
facts which would either not be revealed at all otherwise, or, if so, not 
nearly so significantly as by the historical and comparative method of 
treatment. 

Mr. W. S. Rossiter, chief clerk of the Bureau of the Census, has per- 
formed a most valuable service in the report under review, for in this 
volume of 303 pages are packed facts so numerous and so significant that 
it is absolutely necessary for the student of the history of the United 
States to read and ponder its contents if he would gain the truest insight 
into the economic development of this country from 1790 to 1900. This 
statement might even be extended to include the Colonial period for 
census methods and results in the various colonies previous to 1790 are 
discussed and summarized in an admirable manner in the first chapter 
entitled “Population in the Colonial and Continental Periods.” A most
valuable feature of this first chapter and of the report, generally, is the 
reproduction of some of the old maps, now for the most part rare and 
difficult of access. Among these may be mentioned “Boston, with Its 
Environs,” “New York, with the Adjacent Rocks and Other Remarkable
Parts of Hell-Gate, 1778,” “Plan of the City of New York, 1789,” and 
“Plan of Philadelphia, 1794.” Supplementing the discussion there are 
thirty-seven full pages (149-185) of tables, giving in detail the enumera- 
tions of population in North America prior to 1790. 

Chapter II is entitled “The United States in 1790” and in twenty-six
pages such important topics as boundaries, area, currency, transporta- 
tion, postal service, industries, education, newspapers and periodicals, 
slavery and Indians, are discussed and, so far as may be, in the light of
statistics gathered in the census of 1790. 

Chapter III deals briefly with the census of 1790 including the debates 
in Congress, the first Census Act, the manner in which the law was ex- 
ecuted, the enumerators’ schedules, the enumeration and the returns. 
The original purpose of this volume was to discuss the historical aspects 
of the First Census and to present such statistics as could be compiled 
from the limited and incomplete returns of the first enumeration of the 
population. As this original purpose was being carried out it was found 
that the study could be made more valuable by extending the lines of 
inquiry, by making more extended use of the historical and comparative 
method than was originally intended, and, in a word, by presenting a more
or less complete survey of the population of the United States from early 
colonial times down to and including the results of the Twelfth Census, 
in 1900. 

Chapter IV deals with area and total population at each of the twelve 
census enumerations. The discussion is clarified by maps, charts and ex- 
cellently arranged tables. 

Chapter V treats of the population of counties and their subdivisions, 
also in an historical manner. A most admirable feature in this chapter 
is the outline-map method of indicating changes in county lines. Every
state is thus treated and the information in this graphic form is, for many 
purposes, invaluable. 

Chapter VI contains new information relating to the white and negro 
populations presented in new ways. Such important facts as comparative 
natural increase, effect of immigration and the increase of white popula- 
tion of native stock are considered in this chapter. In any discussion 
of birth rates, race suicide, effect of immigration on natural increase of 
the native element, vital force of the negro element, etc., this chapter 
may be referred to both for facts and suggestions. 

Other chapters containing instructive analyses of the population 
of the United States from 1790 to 1900 are: Chapter VII, entitled 
“Sex and Age of the White Population”; Chapter VIII, “Analysis of the
Family”; and Chapter IX, “Proportion of Children in White Popula- 
tion.” 

Chapter X, entitled “Surnames of the White Population in 1790” 
involved a very considerable amount of labor, but it will prove a storehouse
of information to students interested in name origins and derivations.
Closely related to Chapter X is Chapter XI, entitled “Nationality as
Indicated by Names of Heads of Families Reported at the First Census.” 
The facts presented in this chapter, after making full allowance for neces- 
sary limitations, are of great interest and importance. Approximate 
truth is all that can be obtained in this case, but a careful study of the 
surnames indicates that the English stock contributed 83.5 per cent. of 
all the white population at the period of the first census (1790), “and if 
the Scotch and Irish be added, the British stock represented a little more 
than 90 per cent.; while the Germans contributed slightly less than 6 
per cent., and the Dutch 2 per cent.” The careful analysis of the New
Jersey population of 1790 by Mr. William Nelson is also to be commended. 
In that state approximately 58 per cent. were English and Welsh, 12.7 
per cent. Dutch, 9.2 per cent. German, 7.7 per cent. Scotch, 7.1 per cent. 
Irish, 2.9 per cent. Swedish and Finnish and 2.1 per cent. French. 

Interstate Migration is very briefly discussed in Chapter XII, and the 
Foreign-Born Population in Chapter XIII. The comparative statistics 
of the slave population, 1790 to 1860, are presented in Chapter XIV 
which embraces about ten pages. These statistics are presented in a
concise and very admirable manner. It is one of the most convenient 
and valuable summaries of the statistics of slaves that the writer has 
ever seen. 

Occupations and Wealth constitutes the fifteenth and last chapter of 
the volume under review. The 1790 census contained no occupation 
schedules. For a part of Philadelphia and for Southwark, however, such 
data for heads of families were gratuitously supplied, and those statistics 
are here presented in detail. Other valuable comparisons are made in
this chapter of occupations and wealth as recorded in the six enumera- 
tions, 1850 to 1900. 

Reference has already been made to the thirty-seven pages of general 
tables which give in detail the enumerations of populations in North
America prior to 1790, and it only remains to add that the general tables 
derived from the first and subsequent censuses, 1790 to 1900, fill one 
hundred and eleven of the quarto pages of this most excellent census re- 
port. Finally, to make the volume of the greatest possible utility, there 
is an index of more than four full pages of three columns each. 

F. S. Crum.


WHOLESALE PRICES IN CANADA, 1890-1909. 


Special Report by R. H. Coats, B. A., associate editor of the Labour 
Gazette. Government Printing Bureau, Ottawa. 1910. pp. xiii, 509. 


For some years past the Canadian Labour Gazette has published brief 
monthly notices of significant changes in retail and wholesale prices. 
Growing popular interest in the economic problems connected with the 
recent rise of prices led to the decision of the Department of Labour to 
take up the compilation of price statistics in a more systematic and com- 
prehensive way. Since February, 1910, the Labour Gazette has contained 
monthly quotations of over thirty items entering into the cost of living, 
including the retail prices of important commodities of household con- 
sumption, together with rentals. Such items are obtained from forty- 
eight localities. The present volume is the initial installment of a compila- 
tion of wholesale prices, which it is planned to continue at regular inter- 
vals. As the investigation of wholesale prices was carried backward to 
1890 the present publication may be regarded as establishing a founda- 
tion for the future continuations, and as such is comparable to the first 
installment of the series of wholesale prices published by the United 
States Bureau of Labor.* 

The Canadian report contains the wholesale prices of 230 commodities, 
which is less by only twenty-eight than the number at present gathered 
by the American bureau (as the United States Bureau of Labor may for 
convenience’s sake be called). It is announced, moreover, that an increase 
in the number of price series may be expected in future reports. For 
the most part these prices are for the first market day of each month, 
but thirty-one series are given only in the form of annual averages. Most 
of these thirty-one series are for manufactured commodities for which 
changes in price are apt to happen infrequently. In the few cases in which 
monthly prices would have been desirable but were found impossible, we 
are assured that the yearly averages are “based in each case on expert 
opinion.” In twenty-three cases it was not found possible to begin the 
series of quotations with 1890, and there are a few other gaps and ir- 
regularities, including those resulting from the inclusion of quotations on 
several varieties of fresh fruit, which are limited, very properly, to the 
months in which such fruits are in season. On the whole, the data of the
report do not compare quite favorably in respect to homogeneity and 
consistency with the foundation tables of the American bureau,— the only 
other price tables fairly comparable with the Canadian tables. Even
this comparison is not entirely fair to the Canadian report, for the American
tables cover a period shorter by seven years.


* Bulletin of the Department of Labor, No. 39, March, 1902.


It is to be hoped that in the continuations of the Canadian tables the
practice of the American bureau in giving weekly quotations of such
variable prices as those of butter, eggs, grain, live stock and meats, will be
followed. The price on the first market day of each month may often
be an insufficient guide to the student interested in particular price
variations, and may easily lead to misleading annual average prices for particular
commodities, although it is not to be expected that such discrepancies
will appreciably affect the measure of the general movement of price
variations. The quotations on raw cotton, raw silk and raw rubber are
New York prices, and the quotations on furnace coke are from
Connellsville. With these exceptions the prices quoted are from important
Canadian wholesale markets, most frequently Montreal or Toronto.

The sources used were those customarily drawn upon in such investiga-
tions: trade journals, newspapers, printed reports of local exchanges and
boards of trade and the books of manufacturers and wholesalers. One
notes with satisfaction that quotations drawn from printed sources were
verified so far as possible by “reference to long-established and favorably
known business firms dealing in the articles in question.” Especial care
was used to verify newspaper quotations in this way. In respect to the
fullness of detail with which these sources of information are specified and
described the Canadian report sets a new standard (and a very high one)
for reports of this kind.


TABLE I.


DISTRIBUTION OF SERIES OF QUOTATIONS IN SPECIFIED GROUPS:
REPORT ON WHOLESALE PRICES IN CANADA.


Group.                                          Number of commodities.

 1. Grains and fodder                                    13
 2. Animals and meats                                    15
 3. to 5. [illegible]
 6. Textiles:
      (d.) Linens                                         3
      (f.) Miscellaneous                                  2
      [other subgroups illegible]
 7. Hides, leather, boots and shoes              [illegible]
 8. Metals and implements                                27
 9. Fuel and lighting                                    10
10. Building material:
      (a.) Lumber                                [illegible]
      (b.) Miscellaneous building materials              14
      (c.) Paints, oils and glass                        14
11. House furnishings                                    16
12. Drugs and chemicals                          [illegible]
13. Miscellaneous:
      (a.) [illegible]
      (b.) Liquors and tobacco                            4
      (c.) Sundry                                         8


Possibly the most important criterion of the quality of such an investi- 
gation is the selection and distribution of the commodities listed. While 
relatively less significant in so inclusive a report as this one than in one 
quoting fewer commodities, it nevertheless remains a matter of prime 
importance. Table I shows the classification of commodities adopted 
for purposes of tabulation and averaging, and the number of commodities 
in each group. In Table II, I have redistributed the list of commodities
figuring in the Canadian report into the familiar groups of the United
States Bureau of Labor tables (without striving for absolute precision
in the disposition made of every entry). The American list introduced
for purposes of comparison includes only the 236 commodities for which
the quotations throughout the period since 1890 have been for “practically
the same description of article.” * That the two lists differ in important
particulars is at once apparent. The most noticeable difference is in the
group of textiles (“cloth and clothing”), which includes only eleven
per cent. of the Canadian list as against twenty-six per cent. of the
American list. But the two lists of commodities are even more dissimilar than
is indicated by the differences in the relative importance given to the
various groups. As is indicated by the figures in the third column of
Table II, ninety-two of the commodities in the Canadian list, or forty
per cent. of that entire list, are not included in the American list. After
making due allowance for the fact that some of the Canadian groups
contain more commodities than the corresponding American groups it
will easily be seen that this further lack of coincidence is relatively most
apparent in the groups of cloth and clothing, house furnishing goods,
miscellaneous goods, and metals and implements, in the order named.


* Bulletin of the Bureau of Labor, March, 1908, p. 316.


TABLE II.


COMPARISON OF DISTRIBUTION IN SPECIFIED GROUPS: UNITED STATES
BUREAU OF LABOR AND CANADIAN QUOTATIONS OF WHOLESALE
PRICES.


                                         Number of quotations    Commodities in
U. S. Bureau of Labor Classification.       in each group.       Canadian list
                                          U. S.     Canadian.    not in U. S. list.

Lumber and building materials            [illegible]
House furnishing goods                      14         16               8
[other groups illegible]
     Total                                 236        230              92


But the differences between the two lists are still greater than has even 
yet been indicated, for in the foregoing comparison no account is taken 
of the fact that in several instances separate quotations are given in the 
Canadian list for different grades or brands of a commodity to which but 
one series is allotted in the American list, or of the fact that one series of 
quotations in the Canadian list is in several cases represented by several 
series in the American list. The Canadian list, for example, gives three 
series for hides and one series for bacon, as against one for the former and 
two for the latter in the American list. The third column of Table II 
simply shows the number of individual series of price quotations in the 
Canadian list which are not represented in the American list by one or 
more series of quotations of similar commodities, and should be taken as 
an index rather than a measure of the lack of coincidence between the two 
lists. 
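
The proportions cited in this comparison can be rechecked from the totals quoted in the review. The sketch below is a simple verification only; the variable names are illustrative, and the figure of 258 is derived from the statement that the Canadian list of 230 is "less by only twenty-eight" than the number at present gathered by the American bureau.

```python
CANADIAN_SERIES = 230              # commodities in the Canadian report
AMERICAN_COMPARABLE_SERIES = 236   # American series comparable since 1890
NOT_IN_AMERICAN_LIST = 92          # Canadian series with no American counterpart

# Share of the Canadian list not represented in the American list.
share_not_represented = 100 * NOT_IN_AMERICAN_LIST / CANADIAN_SERIES
print(round(share_not_represented))  # 40

# Number of series currently gathered by the American bureau, as implied
# by "less by only twenty-eight".
american_current_series = CANADIAN_SERIES + 28
print(american_current_series)  # 258
```

As the reviewer cautions, the count of 92 is an index rather than a measure, since it ignores differences in the number of grades quoted for the same commodity.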

That “beggars cannot be choosers” has more than once stood as an
apology for the shortcomings of compilations of price statistics, and it 
would seem easily possible that limitation of sources has been a more serious 
factor in determining the makeup of the Canadian list than that of the 
American. But this should not be taken as indicating that the Canadian 
list is necessarily the inferior one. Moreover, I am inclined to doubt 
that paucity of materials has been the controlling reason for the varia- 
tions of the Canadian list from the pattern set by the American list. 
Further reasons, which seem to be fairly sufficient in themselves, are to 
be found in (1) differences between the dominant features of industry and 
trade in Canada and in the United States, coupled with (2) adherence to 
somewhat different purposes in the compilation of the two lists. In the 
Canadian report we find, for example, a relatively larger list of farm prod- 
ucts, a relatively smaller list of manufactured staples (especially tex- 
tiles), a relatively larger list of various kinds of lumber and other building 
materials, and a noticeably larger assortment of miscellaneous articles 
important in retail, and hence in wholesale trade. In these and other 
points (such as the presence of four series of furs in the Canadian list and 
the absence of furs in the American list), the relatively immature con- 
dition of Canadian industrial life is reflected. 

This consideration gains in significance in view of the statement of the 
Report (page 3) that as the object of the investigation was “to obtain 
a result representative of cost of living and the industrial life of the
community as a whole, the plan was to embrace as many as possible of the main
staple articles of Canadian production and consumption consistent with
the avoidance of duplication and the preservation of proportion as between
the several divisions into which the inquiry fell.” Again, it is stated (page
8) that “The consumption standard has formed the basis of selection; but
the aim has been to reflect production and general trade as well.” As a
matter of fact about forty-three per cent. of the commodities in the Cana- 
dian list are foods or food materials, and about thirteen per cent. may 
fairly be brought under the head of “clothing.” These proportions are
very close to measuring the importance of food and clothing respectively 
as articles of consumption, as indicated by the study of workingmen’s 
budgets. In the American list, on the other hand, food and clothing 
count for thirty-four and sixteen per cent., respectively, of the total number 
of series of quotations.* That is, the Canadian list seems to satisfy the 
requirements of the consumption standard far more closely than does the 
American list. But this is hardly a mark of superiority in the Canadian 
list. The monthly statistics of “cost of living,” previously mentioned,
should undoubtedly be interpreted in the light of the consumption standard, 
but a table of wholesale prices can be only indirectly useful in this way. 
Tables of wholesale prices have other uses. They illuminate some
of the phenomena of periods of business prosperity and depression, and 
they constitute the most important single tool of the student of the effect 
of the increasing production of gold upon prices. But for such purposes 
it is sufficient if they “reflect production and general trade” in a fairly 
adequate way. 

On general grounds, therefore, it may seem that the Canadian tables 
concede too much to the demands of the consumption standard. But a 
detailed examination of the list has served to convince me that, whether 
on account of a happy coincidence between the importance of particular 
commodities in Canadian industry and trade and their importance in 
terms of the consumption standard, or because of the careful way in which 
the dual purpose of the tables has been kept in mind by the compiler of 
the list, the Canadian tables do afford unusually excellent material for 
one who approaches the subject from the side of industry and trade. 
Averages based on so large a group of quotations are, of course, bound 
to be fairly precise in any case. But over and above the merit of inclu- 
siveness, the Canadian tables have the merit of being a really miscellaneous 
(non-specialized) group of quotations, fairly constituting a “random
sampling” of the multitude of commodities actually priced in the market. 

* Prof. J. P. Norton, in Quarterly Journal of Economics, Vol. xxiv, p. 755.

In reducing each series of price quotations to relative prices, the average
prices of the decade 1890 to 1899 were used as the base. This facilitates
comparisons with the relative prices of the American tables, which are
computed on the same base. The general trend of prices is shown by
simple unweighted arithmetic averages. For test purposes a weighted
average was computed, the weights being substantially those recommended
by the committee of the British Association in 1888. As might be
expected, the curve of weighted averages follows very closely the curve of
unweighted averages, although it drops somewhat lower in 1897, the low
year, and rises somewhat higher in 1907, the high year. These greater
fluctuations (sometimes misinterpreted as “greater sensitiveness”) of
the weighted average are evidently due in this case to the greater
importance assigned in it to the products of the farm, with their extreme price
fluctuations. Unweighted index numbers are also given for each group 
and subgroup of the classification shown in Table I, above, and average 
relative prices in 1909 are given for other groupings. All these averages, 
together with the series of relative prices for each of the 230 commodities 
are shown graphically in an elaborate series of charts. 
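
The method of computation described above may be illustrated by a minimal sketch in Python. The quotations and weights are invented for the purpose and are not taken from the Canadian report, and the function names are assumptions of this example. Each series is first reduced to relatives on the base of its own 1890 to 1899 average; the relatives are then combined by a simple and by a weighted arithmetic mean.

```python
from statistics import mean

BASE_YEARS = range(1890, 1900)  # the decade 1890 to 1899


def relative_prices(prices, base_years=BASE_YEARS):
    """Express each year's price as a percentage of the base-period average."""
    base = mean(prices[year] for year in base_years)
    return {year: 100.0 * price / base for year, price in prices.items()}


def unweighted_index(relatives, year):
    """Simple arithmetic average of the relatives for one year."""
    return mean(series[year] for series in relatives.values())


def weighted_index(relatives, weights, year):
    """Weighted arithmetic average of the relatives for one year."""
    total = sum(weights[name] for name in relatives)
    return sum(weights[name] * series[year]
               for name, series in relatives.items()) / total


def flat_series(base_price, price_1909):
    """Hypothetical series: constant in the base decade, one quotation for 1909."""
    series = {year: base_price for year in BASE_YEARS}
    series[1909] = price_1909
    return series


# Invented quotations: a farm product that rose sharply, a metal that rose little.
relatives = {
    "wheat": relative_prices(flat_series(2.0, 3.0)),    # relative of 150 in 1909
    "iron": relative_prices(flat_series(20.0, 22.0)),   # relative of 110 in 1909
}
weights = {"wheat": 3, "iron": 1}  # invented weights favoring the farm product

print(unweighted_index(relatives, 1909))         # 130.0
print(weighted_index(relatives, weights, 1909))  # 140.0
```

In this invented case the weighted average stands higher than the unweighted one only because the heavier weight falls on the commodity with the larger price movement.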


TABLE III. 


AVERAGE PER CENTS. OF INCREASE SHOWN BY WHOLESALE PRICES IN 
CANADA IN 1909, 


Compared with decade 1890-1899; year


49.9 
48.6 
33.6 
34.0 
7.6 
8.3 
14.2 
29.8 
*6.8 
*4.0 
12.5 
*4.6 
35.4 
2.1 
3.8 


28.4 
29.7 
20.7 
*11.8
92.8
*27.1
*6.2
84.5 

14.0 


34.2... .1902 
43.5... .1898 
5.9....1901 
22.6....1895 
25.7... .1898 
17.6... .1899 
45.9... .1896 
14.9....1897 
11.0... .1898 


70.2... .1898 
41.5....1897 


54.6 
28.4 35.2 
(c) Paints, oil and 5.7 | 20.9....1898 


10.1 10.4  13.2....1896 
3.9 11.3....1890 


162.6 | 127.2 | 182.2....1895 
(6) Liquors and | 23.8 | 17.5 23.8... .1890 
8.5 | 21.6  33.3....1897 


Grains 85.9....1807
Animal 80.3....1896
Dairy p 48.2....1897
Fish... 47.9....1992
Other f 25.0....1897
Textiles 15.7....1895
(b)
(c)
(d)
(e)
Hides,
Metals
Building materials:
Miscellaneous:
* Decrease. 




A comparison of the general index number for the 230 commodities 
with the similar index number computed from the American list shows 
that relative prices in Canada did not, on the average, fall quite so low 
in 1897 nor rise quite so high in 1907 as did relative prices in the United 
States (and this notwithstanding the much greater importance of agri- 
cultural products in the Canadian list). A further, and possibly less 
valid, comparison with Mr. Sauerbeck’s index number, recalculated to the 
base of average prices in the decade 1890-1899, indicates that since 1899 
the movement of prices in Canada has been about midway between the 
movement of prices in England and in the United States. But it is be- 
yond the scope of this review to even summarize the more important re- 
sults of this thoroughly praiseworthy investigation. In Table III, how- 
ever, one of the more important summary tables of the Report is reprinted. 
It may be expected that the report will be utilized in connection with the 
American tables by those interested in the effect of the tariff on the move- 
ments of particular groups of prices in the United States,—and there is 
no reason why it should not be, if due account is taken of the many and 
frequently subtle difficulties in comparisons of that kind. 
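
Recalculating an index number to a new base, as was done here with Mr. Sauerbeck's series, amounts to dividing each annual figure by the average of the series over the new base period. The following Python sketch shows the arithmetic only; the figures are invented and are not Mr. Sauerbeck's, and the function name is an assumption of this example.

```python
from statistics import mean


def rebase(index_by_year, base_years):
    """Rescale an index so that its average over base_years equals 100."""
    base_average = mean(index_by_year[year] for year in base_years)
    return {year: 100.0 * value / base_average
            for year, value in index_by_year.items()}


# Invented index on some earlier base: alternating 60 and 70 through the
# decade 1890 to 1899 (average 65), with a single later figure for 1909.
old_index = {year: (60 if year % 2 == 0 else 70) for year in range(1890, 1900)}
old_index[1909] = 78

new_index = rebase(old_index, range(1890, 1900))
print(new_index[1909])  # 120.0
```

Once two series have been put on a common base in this way their movements can be compared directly, although the comparison is no better than the comparability of the commodity lists behind them.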

An appendix of seventy pages contains a “Memorandum on the construction
of an index number of commodity prices, with a review of important
British and foreign index numbers, and a statement relating to
the causes and effects of variations in prices.” This may be commended
as accurate and well balanced, although it contains nothing not
conveniently accessible elsewhere. The list of index numbers that have 
been constructed in the United States omits the important one compiled 
by Prof. John R. Commons,* as well as Prof. W. C. Mitchell’s greatly
improved retabulations of the results of the Aldrich inquiry.†
Allyn A. Young.


Two recent volumes that should be of especial interest to teachers of 
statistics are a Primer of Statistics, by W. P. and E. M. Elderton (London:
A. and C. Black, 1910), and Mr. A. L. Bowley’s An Elementary Manual 
of Statistics (London: Macdonald and Evans, 1910). 

The Primer of Statistics is designed to carry out a suggestion of Sir 
Francis Galton (who contributes a preface to the book) to the effect that 
the elementary concepts of the modern system of biometric statistics 
might be explained in a much simpler fashion than has been usual. In 
this aim the authors have succeeded: the book is, indeed, a veritable 
primer. Frequency distributions and their important constants, such 
as the median, quartile, mode, standard deviation, and coefficient of 
correlation, are explained in a very elementary way and are illustrated 
by concrete examples of the distribution of cricket scores and of simple 
biometric data. The discussion does not penetrate into the subject far 

* Quarterly Bulletin of the Bureau of Economic Research, July, October, 1900. 


† “Gold, Prices and Wages under the Greenback Standard,” Publications of the
University of California, Economics, Vol. I.




enough to be of service to one who wishes to know how to handle fre- 
quency distributions in actual statistical investigation. It will probably 
have achieved its purpose if it appreciably enlarges the audience to which 
an exposition of the results of the use of these methods will be intelligible. 
The authors tread on dangerous ground (for an elementary treatise) 
when, like others of their school, they suggest the abandonment of the 
use of the “probable error of an average” in favor of the “standard
deviation of an average.” Their statement (p. 78) that the use of the older
constant assumes that the frequency distribution dealt with follows the 
normal curve of error is thoroughly misleading, for it takes no account 
of the frequently found conditions under which averages of quantities 
not distributed in accordance with the normal law will themselves follow 
this law. Moreover, the “probable error” has certain advantages of 
its own. (On this general subject see the papers by Professor Edgeworth 
“On the Probable Errors of Frequency Constants,” Journal of the Royal 
Statistical Society, Vol. lxxi, Parts 2, 3, and 4.) 
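
When the average itself follows the normal law, the two constants in question are connected by a simple relation: the probable error is about 0.6745 times the standard deviation of the average, so that deviations smaller and larger than the probable error are equally likely. The Python sketch below is offered only as an illustration under that assumption. The observations and function names are invented and do not come from the Primer, and the divisor n - 1 used by the library routine is a modern convention.

```python
from math import sqrt
from statistics import NormalDist, stdev

# Upper quartile of the standard normal curve, approximately 0.6745.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)


def standard_deviation_of_average(sample):
    """Sample standard deviation (divisor n - 1) over the square root of n."""
    return stdev(sample) / sqrt(len(sample))


def probable_error_of_average(sample):
    """Probable error of the average, assuming the average is normally distributed."""
    return PROBABLE_ERROR_FACTOR * standard_deviation_of_average(sample)


scores = [12, 15, 9, 22, 18, 14, 11, 19]  # invented observations

sd_of_mean = standard_deviation_of_average(scores)
pe_of_mean = probable_error_of_average(scores)
print(round(PROBABLE_ERROR_FACTOR, 4))  # 0.6745
print(pe_of_mean / sd_of_mean)          # the same factor, by construction
```

Since the one constant is a fixed multiple of the other under the normal law, the choice between them bears on interpretation rather than on the information conveyed.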

Mr. Bowley’s new book differs from his well-known Elements of Statistics, 
not only in its greatly diminished mathematical apparatus, but also in 
its different purpose and scope. It may be described as a handbook of 
statistical criticism. The first part deals with the elementary statistical 
processes,—averaging, tabulation, the use of diagrams and the like, but 
with special emphasis throughout upon the necessity of recognizing and 
stating the limitations of the accuracy of the data used and of the pro- 
cesses themselves. The second part is a survey of the more important 
classes of English official statistics. How the statistics are gathered, 
what they actually mean, their inherent limitations and the kinds of 
inferences that safely may be based on them are all stated with clear- 
ness and precision and with pertinent concrete illustrations. The book 
is hardly constructive enough to serve as a formal text-book, but it is a 
good book to put into the hands of students or of others who are entering 
upon their statistical apprenticeship. Some of the fundamental criticisms 
of English statistics should be useful to all statisticians. We greatly need 
a book dealing in similar fashion with American statistics or, better yet, 


with official statistics in general. 
A. A. Y. 




